perm filename NOTES.JRA[LSP,JRA]5 blob sn#110746 filedate 1974-07-09 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00049 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00006 00002	.SEC(Evaluation of LISP expressions,Evaluation)
C00018 00003	.SS(Sexpr translation of LISP expressions)
C00025 00004	.SS(A first look at symbol tables,symbol tables)
C00029 00005	.SS(Introduction to %3λ%*-expressions)
C00034 00006	.SS(%3λ%*-expressions,%3λ%*-expressions,P49:)
C00044 00007	.SS(Mechanization of evaluation,,P6:)
C00055 00008	.SS(Examples of %3eval%*,examples of %3eval%*)
C00059 00009	It should now be clear that %3eval%* and friends %2do%* begin to
C00064 00010	.<<the |   | symbol table >>
C00070 00011	.SS(Environments,,P77:)
C00077 00012	.REQUIRE "PROB8.PUB" SOURCE_FILE
C00078 00013	.SS(%3label%*,%3label%*,P38:)
C00081 00014	.SS(functional arguments,functional argument,P76:)
C00092 00015	.<<pic of funarg execution>>
C00098 00016	.SS(special forms,special form,P8:)
C00104 00017	.SS(The %3prog%*-feature,,P37:)
C00118 00018	.SS(In retrospect)
C00126 00019	.SS(Towards a Mathematical Semantics)
C00130 00020	.SEC(Running on the machine)
C00136 00021	.SS(%3evalquote%*,%3evalquote%*)
C00144 00022	.SS(A project: extensions to %3eval%*,%3eval%*,P67:)
C00153 00023	.SS(A project: Syntax-directed processes,syntax-directed)
C00154 00024	.SEC(Toward better symbol tables and implementation,symbol tables)
C00166 00025	.<<atom struct for fact>>
C00177 00026	.<<pic of NIL>>
C00178 00027	.SS(A first look at %3cons%1)
C00190 00028	.SS(%3read%* and %3print%*,,P13:)
C00197 00029	.SS(Hashing,hashing,P14:)
C00204 00030	.SS(A review of the structure of  the LISP machine,LISP machine)
C00206 00031	.SS(More on symbol tables,symbol tables)
C00210 00032	.SS(Global Variables,global variables)
C00214 00033	.SS(AMBIT/G,AMBIT/G,P26:)
C00223 00034	.<<pic of AMBIT/G for car>>
C00226 00035	On following pages are AMBIT/G algorithms for →%3cdr%*↔←, →%3cons%*↔←, 
C00229 00036	.SS(garbage collection again,garbage collection)
C00236 00037	.CENT(A simple LISP garbage collector in AMBIT/G)
C00237 00038	.CENT(A simple LISP garbage collector written in LISP)
C00242 00039	.CENT(A link-bending garbage collector for LISP)
C00246 00040	.SS(Syntactic dominoes)
C00247 00041	.SS(The Contour Model,Contour Model,P31:)
C00250 00042	.SS(Macros and special forms,macros,P18:)
C00266 00043	.SS(%aSM%*-%3eval%*,%3eval%*,P30:)
C00272 00044	.SS(Binding revisited,bind,P78:)
C00280 00045	.<<PROBS OF BINDING AND FEXPRS>>
C00282 00046	.SS(Review)
C00286 00047	.SS(%aSM%*:A Simple machine,%aSM%*,P33:)
C00293 00048	.SS(An alternative: deep bindings,deep binding)
C00301 00049	.END "JRA"
C00302 ENDMK
C⊗;
.SEC(Evaluation of LISP expressions,Evaluation)
.BEGIN "JRA" TURN ON "←","#";
.SS(Introduction)
%1
We shall now begin to look very closely at the intuitive
process which we have been using in the ⊗→evaluation↔← of LISP expressions.
Why? First, in an abstract sense,
that is the business of programming languages--the evaluation of functions.
They aren't called procedure-oriented languages for nothing!  Also,
function evaluation is the basic tool in the evaluation of LISP expressions.
In fact a very general kind of evaluation process is going on in LISP,
namely the evaluation of recursive functions.  Later we will
study the mechanisms which support recursive evaluation.

First, as we mentioned in regard to the polynomial arithmetic
examples the intuitive mathematical operations say nothing about the process
of evaluation.  %3(2*3) + (5*6)%* evaluates to %336%* regardless
of whether you evaluate %32*3%* first or %35*6%* first, or whether you do them in parallel.
If we were forced to explicate mathematical
evaluation, we would probably have to say that it is parallel evaluation.

Already on {yon(P17)} we have seen that order of 
evaluation in conditional expressions can make a difference, 
and certainly when we try to mechanize LISP
function evaluation on a sequential machine we %2will%* have to choose
%2some%* order of evaluation.  We must also make some decision regarding
the order of evaluation of the arguments to a function call,
say %3f[x%41%*;x%42%*; ...x%4n%1].  We will assume that we will evaluate the arguments
from left to right.  The evaluation scheme which we choose
is called %2⊗→inside-out↔←%* or %2⊗→call-by-value↔←%*.  Alternative proposals exist
and have their merits;  ⊗→call-by-name↔← (outside-in) evaluation is another
well-known scheme.  
Here again, the style of evaluation can make a difference. Consider the example due
to J. Morris:

%3f[x;y] <= [x = 0 → 0; T → f[x-1;f[y-2;x]]]%*. Evaluation of %3f[2;1]%*
will terminate if we always evaluate the leftmost-outermost occurrence of %3f%*.
Thus:

←%3f[2;1] = f[1;f[-1;2]] = f[0;f[f[-1;2]-2;1]] = 0%*

However, if we evaluate the rightmost-innermost occurrences first the
computation will not terminate.
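To see why, trace a few steps: under this scheme the inner call %3f[-1;2]%* must be
completely evaluated before the outer call can proceed, but %3f[-1;2]%* reduces to
%3f[-2;0]%*, which demands %3f[-2;-2]%*, which demands %3f[-4;-2]%*, and so on; the
first argument of these inner calls never becomes %30%* and the recursion never
terminates.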

Intuitively, call-by-value says:
evaluate the arguments to a function before you attempt to apply the arguments 
to the function definition.  Here are some examples:
Let %3f[x;y]%*  be %3x%82%* + y%1 and consider %3f[3+4; 2*2]%*.  Then
call-by-value says evaluate the arguments (%37%*#and#%34%*), associate those
with the formal parameters of %3f%* (i.e. %3x:7; y:4%*), and evaluate the body of
%3f%* (i.e.#%37%82%*#+#4#=#53%1); call-by-name says pass the unevaluated actual 
parameters to the function, then perhaps evaluate.  I.e. %3(3+4)%82%*#+#2*2%1
arises (which also evaluates to %353%*). We will say more about other styles of
evaluation later. However most of this section will be restricted to call-by-value.

Notice that the process of evaluation by value
is recursive.  That is, it goes something like this:
If the expression to be evaluated is a constant then the value
(of the expression) is that constant.  (The value of %33%* is %33%*.)
If the expression is a variable then see what the current value associated
with that variable is. (Within the evaluation of say
%3f[3;4]%* where %3f[x;y]#<=#x%82%*#+#y %1the current value of the variable
%3x%* is %33%*.)  The only other kind of expression that we can have is a function
name followed by arguments. (Say %3f[3;4]%*).  In this case we evaluate the arguments,
that is, we %2recur%*  and then apply the definition of the function to those
evaluated arguments.  How do you apply the evaluated arguments to the function
definition?  You associate or ⊗→bind↔← the formal parameters (or variables)
of the definition to the values of the actual parameters.
Then you evaluate the body of the function in this new environment.  Why all the emphasis 
on the mechanism of function evaluation?  For one reason implementation
of programming languages requires careful consideration of this
problem.  If you understand the formal model of LISP and the corresponding
model of implementation, you should have an exceptionally
clear picture of the problems of implementation of languages, and you
will understand many of the problems of semantics of languages.

.P80:
How do we transcribe the evaluation of arithmetic functions to LISP functions?
It's easy.  First, the class of LISP constants is larger
(than the integers).  What are the constants of LISP? They're just the Sexprs. 
(The  value of %3(A#.#B)%* is %3(A#.#B)%* --- just like the value of %33%* is %33%*
⊗↓We are ignoring the distinction between the %2numeral%* %33%* and the %2number%3 3.%1←).
The additional artifact of LISP which we have to talk about with respect to
evaluation is the conditional expression. But clearly its evaluation can also
be precisely specified. See {yon(P40)}.
In more specific detail here is some of the structure of the
LISP evaluation mechanism:

If the expression to be evaluated is a constant then the value
is that constant.

If the expression is a variable find its value in the current environment.

If the expression is a conditional then it is of the form
%3[p%41%* → e%41%*; p%42%* → e%42%*; ... p%4n%* → e%4n%1].  Evaluate it
using the semantics defined on {yon(P40)}.

If the expression is of the form: function name followed by 
arguments then:

.BEGIN TABIT1(5);
\%21.%1 evaluate the arguments (from left to right)
\%22.%1 find the definition of the function
\%23.%1 associate the evaluated arguments with the formal parameters in the function definition
\      That is, modify the environment, binding variables to their values.
\%24.%1 evaluate the body of the function.
.END
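
For example, to evaluate %3f[2+1;2*3]%* where %3f[x;y] <= x + y%*: we evaluate the
arguments, getting %33%* and %36%*; find the definition of %3f%*; bind %3x%* to %33%*
and %3y%* to %36%*; and finally evaluate the body, %3x + y%*, in this new environment,
getting %39%*.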

So we %2can%* describe the outline of a mechanical procedure which
can be used in the evaluation of LISP functions.  Recall our  discussion
of the differentiation formulas in {yonss(P41)}.  We described the mechanism of 
differentiating as a recursive process.  We then mechanized that scheme
as a LISP function, translating the polynomials to list notation,
since the domain of LISP functions is Sexprs.  We are in a similar
situation here.  It should be clear that we could write a LISP
function representing the evaluation process provided that we can
find a representation for LISP expressions as Sexpressions. 
This mapping of LISP expressions to Sexprs is our first order of business. We 
will accomplish this mapping by using an extension of the scheme introduced in
{yonss(P41)}.

This whole process of mapping LISP expressions onto sexprs, and writing a LISP 
function to act as an evaluator may seem a bit incestuous or difficult to fathom.
First, it is no more obscure than the differentiation example.
It is just another instance of the diagram of {yon(P46)},
only now we are applying the process to LISP itself.  The effect is to force
us to make precise exactly what is meant by LISP evaluation. This
precision will have many important ramifications.

Second, we've been doing evaluation of sexpr representations of
LISP expressions already.
The %2great mother of all functions%* is exactly the evaluation mechanism for 
the LISP primitive functions and predicates, %3car, cdr, cons, atom%* and %3eq%*
when restricted to functional composition and constant arguments.
The %2great mother revisited%* is the extension to conditional expressions.

But let's stop for some examples of translating LISP functions into Sexpr notation.


.SS(Sexpr translation of LISP expressions)

.BEGIN NOFILL;TABS 25,50;TURN ON "\";
.GROUP

\%2LISP expression\translation as Sexpr

%1we will represent numbers just as numbers, e.g.:
\%32\2

%1we will translate identifiers to their upper-case counterpart. Thus:
%3\x\X
\y2\Y2
\car\CAR
.APART
.BEGIN FILL;


%1Now we've got a problem: 
We wish to map arbitrary LISP expressions to Sexpressions. The LISP expression,
%3x%* translates to %3X%*; %3X%* is itself a LISP expression (a constant); 
%2it%* must also have a translate.

We must be a little careful here. 
When we write Son of Great Mother we will give it an sexpr representation
of a form to be evaluated. We might give it the representation of %3car[x]%*
in which case the value computed will depend on the current value bound to %3x%*.
We might also give the representation of %3car[X]%*; in this case we should
expect to be presented with an error message.
Or, for example, some function %3foo%* which we wish to write may return the sexpr
representation of a LISP form as its value.
Say %3foo[1]%* returns the representation of %3car[x]%* and %3foo[2]%* returns
the representation of %3car[X]%*. We must be able to distinguish between these
representations.
That is, given the representation,
there should be exactly one way of interpreting it as a LISP expression.
The mapping must be 1-1. So we must represent %3x%* and %3X%* as %2different%* sexprs.
The translation scheme we pick is:
for any sexpression, y, its translation is %3(⊗→%3QUOTE%*↔←%1 y%3)%1.
.END
.GROUP
.SELECT 1;
For example:

\%3X\(QUOTE X)
\(A . B)\(QUOTE (A . B))
\QUOTE\(QUOTE QUOTE)

.APART
.GROUP
%1
.BEGIN FILL;
We must also show how to map expressions of the form %3f[e%41%*#;#...#;e%4n%*]%1
onto  sexprs.
We have already seen one satisfactory  mapping for functions in prefix form
in {yonss(P41)}.
We will use that mapping (called Cambridge Polish) here. That is:
.END
\%3f[e%41%*;e%42%*; ...e%4n%*]\(F e%41%*' e%42%*' ...e%4n%1')
\\  where %3e%4i%1' is the translate of %3e%4i%3
%1Examples:%*
\car[x]\(CAR X)
\car[X]\(CAR (QUOTE X))
\cons[cdr[(A . B)];x]\(CONS (CDR (QUOTE (A . B))) X)
.APART
.GROUP

%1What's left?   Conditional expressions. The mapping is self-explanatory.
\%3[p%41%* → e%41%*; ... p%4n%* → e%4n%*]\(⊗→%3COND%*↔← (p%41%*' e%41%*') ... (p%4n%*' e%4n%*'))
%1and an example:%3
\[atom[x] →1; q[y] → X]\(COND ((ATOM X) 1) ((Q Y) (QUOTE X)))

%1
.APART
.END
.P44:
Notice that %3(COND#(#...#))%* and %3(QUOTE#...#)%* %2look%* like translates of function
calls of the form %3cond[#...#] %* and %3quote[#...#]%*. This is %2not%* the case,
%3QUOTE%* and %3COND%* are not translates of LISP functions.
The constructs %3QUOTE%* and %3COND%* do not use call-by-value evaluation and
are handled in a special way as we shall soon see.

Since %3T%* and %3NIL%* are used so frequently, we chose their
translates to be %3T%* and %3NIL%* rather than %3(QUOTE#T)%* and %3(QUOTE#NIL)%*.

You should now go back and look at the %2tgm%*'s ({yonss(P39)}) again now 
that you know
what they mean.  You should note that with respect to atoms, the great
mothers are only defined for %3T%* and %3NIL%*.  Other atoms (which are
the translates of variables) are not recognized and cause an error message.
The question of variables requires some careful study.  Before we do that,
let's see what the great mothers actually accomplish.
First, tgm's explicitly tell you what the meaning of various subsets
of LISP is.  That is, the tgms explicitly codify the semantics (or meaning)
of LISP.  Next, if you are able to code tgm on a machine, then you can
input the Sexpr form of LISP expressions and they will be evaluated for you.
%2Great mother does it all.%*


The part of our informal argument which is clearly missing
 in the tgms is the details of handling variables and the names of functions.  That is,
how do we describe the binding of variables to values as a mechanical process
and, given the name of a defined function, how do we retrieve the actual definition?
These are actually the same problem: that of symbol table manipulations.


.SS(A first look at symbol tables,symbol tables)
%1

One of the distinguishing features of Computer Science is
its reliance on the ability to store and recover information.
Any formalism which purports to address itself to Computer Science
must take this into account.  In particular we must be able to
handle the effect of assignment or binding, that is, the association
of a value with a name in an environment.  A common device used to accomplish this
is a symbol table. In essence, a symbol table is simply a set of ordered pairs
of objects (a function or relation, if you wish).  One of the
elements of each pair is a name;  the other is considered to be
a value.  We will come back to symbol tables in a while to study 
the problems of efficient storage and retrieval of information. 
It will suffice now simply to think of a symbol table as represented
in LISP by a list of dotted pairs --- name dotted with value---.
These simple symbol tables are also known as ⊗→association lists↔←
or %2⊗→a-lists↔←%*.

Since we are encoding the %2set%* of objects as a list, we are actually
encoding it as a %2sequence%*. It will be quite useful to exploit this
ordering imposed in the sequence.

.GROUP
Recall that we are representing variables as atoms; then if %3x, y, %*and %3z%*
had current values %32, 3, %*and %34%*, then a symbol table describing that fact could
be encoded as:

←%3((X . 2)(Y . 3)(Z . 4)) .%1
.APART
.GROUP

In the sequel, we will need to add elements to a table and, given a name, find
the corresponding value.
Here's a simple symbol-table-searching function, ⊗→%3assoc%*↔←:
.BEGIN TABS 10,25;TURN ON "\";NOFILL;
%3

\assoc[x;l] <= \[null[l] → NIL;
\\ eq[caar[l];x] → car[l];
\\ T → assoc[x;cdr[l]]]

%1e.g.%*
←assoc[Y;((X . 2)(Y . 3)(Z . 4))] = (Y . 3)
←assoc[U;((X . 2)(Y . 3)(Z . 4))] = NIL

.END
.APART
%1
%3assoc%* will examine the table from left to right, looking for the first
pair whose %3car%*-part matches the given atom. If such a pair is found,
then that pair is returned.  If %2no%* such pair is found, the
result is %3NIL%*.  Notice there may be more than one pair with the same
%3car%*-part; only the %2first%* is found.
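
For example,

←%3assoc[X;((X . 5)(X . 2)(Y . 3))] = (X . 5)%*

and the pair %3(X . 2)%* is effectively hidden. This shadowing will be exploited
when we bind λ-variables during function calls.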
.SS(Introduction to %3λ%*-expressions)
How about adding things to tables?  We will see that %3cons%* plus
the recursion mechanism will automatically update the symbol table
for %2Great-mother%* when we call a function.
We will be adding entries to the symbol table when we call a function, 
binding the formal parameters to the actual parameters. Intuitively,
when we call %3f[x;y] <= (x*y + y) %1as%3 f[2;3]%1, the dotted pairs
%3(X . 2) %*and %3 (Y . 3)%* should find their way into the table.  

Besides the binding of formal parameters to actual values, there is
another binding process occurring in LISP.
Before we can call the function named %3f%*, its definition must 
be in the symbol table; so there should be an entry (dotted pair) 
with name-part %3F%* and a value-part representing %3(x*y + y)%*.  An obvious 
representation is perhaps,

←%3(PLUS (TIMES X Y) Y)%*

That's not quite good enough. If we simply associate the name %3F%*
with the list %3(PLUS#(TIMES#X#Y)#Y)%*, we will have lost some crucial information.
Namely, the relation between the variables %3x%* and %3y%* and the
body of the function will not be present. For example, %3f[y;x]#<=#(x*y#+y)%* 
is %2not%*
the same function, but in our naive translation they have the same 
representation.  The solution is obvious: we must associate the list of 
formal parameters with the body of the definition.  Perhaps,

←%3((X Y)(PLUS (TIMES X Y) Y)).%*

This is the right approach and it can be made precise.
There is a very elegant and beautiful theory behind this intuitive
discussion.  We will sketch a few parts which will be helpful, but
first let's see where we are:

.SS(Review of evaluation)

We have seen that it is possible to mechanize at least one 
scheme for evaluation of functions -- left-to-right and call-by-value.
We have seen that it is possible to translate LISP expressions
into Sexprs in such a way that we can write a LISP function
which will act as an evaluator for such translates.  In the process
we have had to mechanize the intuitive devices we (as humans) might
use to recall the definition of functions and to recall the current
values of variables.  It was soon clear that the same mechanism
could be used: that of symbol tables.  To associate a variable with
a value was easy: to associate a function name with its definition
required some care.  That is, part of the definition of a function 
involves the proper association of formal parameters with the
body of the definition. (We actually need to be a bit more
precise than this in describing recursive functions, but this is
good enough for now.)  The device we chose is called the %2lambda
notation%*.

.SS(%3λ%*-expressions,%3λ%*-expressions,P49:)

***more***

The λ-notation is a device invented by the logician, Alonzo Church, in
conjunction with his studies in logic and foundations of mathematics.
It was introduced into Computer Science by John McCarthy in 
the description of LISP.

.BEGIN TURN ON "←";
As we intimated previously, the λ-notation is used in LISP to allow
us to attach names to functions in an unambiguous manner.
We have been very imprecise when writing %3f[x;y]#<=#(x*y#+#y)%* as 
a definition of the function %3f%*. 
First note that %3f[x;y]%* is %2not%* a function, %3f%* is.
To see what  %3f[x;y]%* means consider the following example.
When we are asked to evaluate %3car[(A . B)]%* we say the value is %3A%*.
%3car[(A . B)]%* is an expression to be evaluated; it is a LISP ⊗→form↔←.
If %3car[(A . B)]%*  %2is%* a form then so is %3car[x]%*; 
only now the value of the
form depends on the current value assigned to the variable %3x%*.
So the %2⊗→function↔←%* is %3car%*; the %2form%* is %3car[x]%*.
The function is %3f%*; %3f[x;y]%* is a form, as is %3x*y#+#y%*.

Also our notation has really been specifying more than just the name.
The notation specifies  the formal parameters (%3x%* and %3y%*) and
the order in which we are to associate actual parameters in a call 
with the formals of the definition (%3x%* with the first, %3y%* with the second).
More subtly, the notation tells %2which%* variables in the function body
are bound variables.  That is, which variables are to expect values when the
function is called. For example define %3g[x]#<=#(x*y#+#y)%*; then the
value of %3g[2]%* is well defined and is itself a function.

We wish to have a notation to associate all this other information with the
function body so that function definitions can be inserted into the symbol
table as "values". They will be parametric values, but they will be values.
The λ-notation performs this task by preceding the function body with
a list of the bound variables, called %2⊗→lambda variables↔←%*.
The list of variables is usually referred to as the %2⊗→lambda list↔←%*.
Then this resulting construct is preceded by "λ[" and followed by "]".
Thus using the above example, the function %3f%* is exactly the
same function as %3λ[[x;y]#x*y#+#y]%*. 
We actually need a bit more than 
λ-notation to specify recursive functions in a natural manner. See {yonss(P38)}.

The λ-notation introduces nothing new as far as our intuitive binding and
evaluation processes are concerned, it only makes these operations more
precise. So to  evaluate a λ-expression, bind the evaluated actual parameters
to the λ-variables and evaluate the function body.

Since we intend to include  λ-expressions in our language we must include
a translation of them into sexpression form:

.BEGIN GROUP;TABIT2(25,50);
\%2LISP expression\translation as Sexpr%*
\%3λ[[x%41%*; ...; x%4n%*]%9x%*]\(LAMBDA (X%41%* ...  X%4n%*) %1translate of %9x%3)
.END

Thus the character, λ, will be translated to %3LAMBDA%*. %3LAMBDA%* is therefore a
kind of prefix operator (like %3COND%*) which allows the evaluator to recognize
the occurrence of a function definition (just as %3COND%* is used by the
evaluator to recognize the occurence of a (translate of a) conditional
expression).

Here are some examples of %3λ%*-expressions and their evaluation,
followed by some examples of the translation scheme:
.BEGIN NOFILL TABS 10;TURN ON "\";

\%2a.  %3λ[[x;y] x%82%* +y][2;3] = 2%82%* + 3 = 7
\%2b.  %3λ[[x;y] cons[car[x];y]][(A . B); C] = (A . C)
\%2c.  %3λ[[y] cdr[y]][(A B)] = (B)
\%2d.  %3λ[[x] car[x]][λ[[y] cdr[y]][(A B)]]
\     = λ[[x] car[x]][(B)] 
\     = B

\%2e.  %3λ[[x;y] x%82%* +y]  ==> (LAMBDA(X Y)(PLUS (EXPT X 2) Y))
\%2f.  %3λ[[x;y] cons[car[x];y]]  ==> (LAMBDA(X Y)(CONS (CAR X) Y))
.END

******MORE*******

Our LISP syntax equations should now be augmented to  include:

.BEGIN TABIT1(11);GROUP;

<function>\::= λ[<varlist><form>]

<varlist>\::= [<variable>; ... ; <variable>]
\::= [ ]

.END
One of the other more valuable applications of λ-notation occurs in that
nebulous area called efficiency.  Efficiency is like structured programming -- 
difficult to define but recognizable when it occurs.

Consider the following sketch of a function definition:

←%3g <= λ[[x][p[lic[x]] → lic[x]; .... x ...]]%*.

Where %3lic%* may be a %2l%*ong %2i%*nvolved %2c%*alculation.
We certainly must compute %3lic[x]%* %2once%*. But as %3g%* is defined,
we would compute %3lic[x]%* %2again%* as e%41%* if p%41%* is true. Since our
current LISP subset has no side effects, this second calculation
is unnecessary. %3g%* is inefficient. Using λ-expressions, in a style
called %2⊗→internal lambdas↔←%* we can improve %3g%* as follows:

Replace the body of %3g%* with,

←%3λ[[y][p[y] → y; ... x  ...]][lic[x]]%*.    (*1*)

Call this new function %3g'%*:

.BEGIN TABIT2(10,23);GROUP;
\%3g' <= λ[[x]
\\λ[[y][p[y] → y; ... x ... ]][lic[x]]
\\      ] %1.
.END

Now when %3g'%* is called we evaluate the actual parameter, binding it
to %3x%*, as always; and evaluate the body, (*1*). Evaluation of (*1*)
involves calculation of %3lic[x]%* (%2once%*), binding the result to
%3y%*. We then evaluate the body of the conditional expression as before.
If p%41%* %2is%* true, then this definition of %3g'%* involves
one calculation of %3lic[x]%* and two table look-ups (for the value of %3y%*),
rather than the two calculations of %3lic[x]%* in %3g%*.
More conventional programming languages can obtain the same effect as
this use of internal lambdas by assignment of %3lic[x]%* to a
temporary variable. We will introduce assignment statements in LISP in
{yonss(P37)}.
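
As a sketch of how such a construct looks after translation, the body of %3g'%*
becomes a form whose function-part is itself a λ-expression:

←%3((LAMBDA (Y) (COND ((P Y) Y) ( ... X ... ))) (LIC X))%*

where %3P%* and %3LIC%* are the translates of %3p%* and %3lic%*.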
.END

.BEGIN CENTERIT;SELECT 2;
←Problems

.END

I  What is the difference between %3λ[[ ] x*y + y]%* and %3x*y + y%*?

****more probs****

.SS(Mechanization of evaluation,,P6:)
%1

As we have said, %3tgmoaf%* and %3tgmoafr%* ({yonss(P39)}) are evaluators 
for subsets of LISP.
Armed with your symbol-table mechanism we could now extend the
%2great-mothers%* to handle variable look-ups.
Rather than do this we will display our burgeoning cleverness
and make a total revision of the structure of the
evaluators. In making the revision, the
following points should be remembered:

1.	Expressions (to be evaluated) can contain variables, both simple
variables and variables naming λ-expressions. Thus evaluation must be done
with respect to an environment or symbol table. We wish to recognize other
(translates of) function names besides %3CAR, CDR, CONS, EQ%* and%3 ATOM%* 
in our evaluator, but
explicitly adding  new definitions to the evaluator in the style of the
recognizers for the five primitives is a bad idea.  Our symbol table should
hold the function definitions and the evaluator should contain the general
schemes for finding the definitions, binding variables to values, and
evaluating the function body.  By now you should be convinced that this
process is a reasonably well defined task and could be written in LISP.

2.	All %2function%* calls are to be evaluated by-value. However there are
some %2⊗→special forms↔←%* we have seen which are not evaluated in the 
normal manner.  In particular, conditional expressions, quoted expressions,
and lambda expressions are handled differently.

Abstractly, we might write ⊗→%3sogm%*↔← (son of great mother) as:

.BEGIN SELECT 3;GROUP;TABIT1(7);

sogm <= λ[[exp;environ]
\[isvar[exp] → value[exp;environ];
\ isconst[exp] → denote[exp];
\ iscond[exp] → evcond[exp;environ];
\ isfunc+args[exp] → applyfn[func[exp];list_of_sogmed_args[arg[exp];environ];environ]]]

.END
.BEGIN SELECT 3;GROUP;TABIT1(15);

%1where:%*
applyfn <= λ[[fn;args;environ]
\[iscar[fn] → car[arg%41%*[args]];
\ iscons[fn] → cons[arg%41%*[args];arg%42%*[args]];
\       ....               .... 
\ islambda[fn] → sogm[body[fn];newenviron[vars[fn];args;environ]]
\      ....          ..... ]]

.GROUP
%1and:%*
denote <= λ[[exp]
\[isnumber[exp] → exp;           %1(see footnote {yon(P80)})%3
\ istruth[exp] → exp;
\ isfalse[exp] → exp;
\ isSexpr[exp] → rep[exp]]]

.END

%3list_of_sogmed_args%* is a function to perform left-to-right evaluation of the
argument list. %3newenviron%* is a function to manufacture a new environment
reflecting a rebinding of the λ-variables to the new values. %3func%* is a 
selector to pick out the function from the representation of the form; %3arg%*
selects the argument list from the representation. 
This abstract sketch is not meant to be complete, but rather only to
whet the intuition.

With this short introduction we will now write a more general evaluator
which will handle a larger subset of LISP than the %3tgm%*s.

The primary function is named ⊗→%3eval%*↔← (rather than Son of Great Mother). 
It will take two arguments; the first
is a representation of an expression to be evaluated, and the second
is a representation of a symbol table. %3eval%* will recognize numbers, the
constants %3T%* and %3NIL%*, and if presented with a variable, will attempt to 
find the value of the variable in the symbol table.

%3eval%* also handles the previously mentioned special forms. If %3eval%* sees
a ⊗→conditional expression↔← (represented by %3(COND ...)%* ) then the body
of the ⊗→%3COND%*↔← is passed to a subfunction named ⊗→%3evcond%*↔←. 
%3evcond%* encodes the semantics as described on {yon(P40)}. The special
form, ⊗→%3QUOTE%*↔←, signifies the occurrence of a constant. Simply return
that constant.
As far as this %3eval%* is concerned, any
other expression is a function-call. In this case we %2apply%* the function
to the list of evaluated arguments. The function may either be a name or
it may be a λ-expression. %3apply%* handles both cases.

.BEGIN NOFILL;TURN ON "\";TABS 16,27;
.GROUP
Here's the new %3eval%*:
%3
⊗→eval↔← <= λ[[e;a]\[atom[e] →\[numberp[e] → e;
\\ eq[e;T] → T;
\\ eq[e;NIL] → NIL;
\\ T → cdr[assoc[e;a]]];
\ atom[car[e]] →\[eq[car[e];QUOTE] → cadr[e];
\\ eq[car[e];COND] → evcond[cdr[e];a];
\\ T → apply[car[e];evlis[cdr[e];a];a]];
\ T → apply[car[e];evlis[cdr[e];a];a]]]
.APART
.GROUP
%1where:%*

evcond <= λ[[e;a][eval[caar[e];a] → eval[cadar[e];a];
\ T → evcond[cdr[e];a]]]
.APART
.GROUP
%1and,%*

evlis <= λ[[e;a][null[e] → NIL;
\ T → cons[eval[car[e];a];evlis[cdr[e];a]]]]
.APART
%1
.END

The subfunctions, %3evcond%* and %3evlis%*, are simple.  %3evcond %*you've seen
before in %3tgmoafr%*; %3evlis%* simply manufactures a new list consisting of the
results of evaluating (from left to right) the elements of %3e%*, using the
symbol table, %3a%*, where necessary.

The interesting innovations appear in the function, %3apply%*.
⊗→%3apply%*↔← takes three arguments: a function, a list of evaluated arguments,
and a symbol table. %3apply%* explicitly recognizes the five primitive functions %3CAR,
CDR, CONS, EQ, %*and%3 ATOM%*. If the function name is a variable
(represented as an atom), then the
definition is located in the symbol table by %3eval%* and applied to the arguments.
Otherwise the function must be a λ-expression.  This is where things
get interesting. We know what we must do: evaluate the body of the
λ-expression after binding the formal parameters of the λ-expression
to the evaluated arguments. How do we ⊗→bind↔←?  We add variable-value
pairs to the symbol table. We will define a subfunction, %3pairlis%*, to
perform the binding.  Then all that is left to do is give the function
body and the new symbol table to %3eval%*.  Here is %3apply%*:

.BEGIN NOFILL;TURN ON "\";TABS 20,28;
.GROUP
%3
.P69:

⊗→apply↔← <= λ[[fn;x;a][atom[fn] → [\eq[fn;CAR] → caar[x];
\\eq[fn;CDR] → cdar[x];
\\eq[fn;CONS] → cons[car[x];cadr[x]];
\\eq[fn;ATOM] → atom[car[x]];
\\eq[fn;EQ] → eq[car[x];cadr[x]];
\\T → apply[eval[fn;a];x;a]];
\eq[car[fn];LAMBDA] → eval[caddr[fn];pairlis[cadr[fn];x;a]] ]]
.APART
.GROUP

pairlis <= λ[[v;w;x][null[w] → x;
\T → cons[cons[car[v];car[w]];pairlis[cdr[v];cdr[w];x]]]]

.APART
%1
.END
So the only really new development is in the λ-binding process.
Notice that %3pairlis%* makes a new symbol table by consing new pairs onto
the front of the old symbol table; and recall that %3assoc%* finds the
%2first%* matching pair  in the symbol table. %2This is important.%*
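
For example, a minimal illustration of the binding step which we will see again in
the traces below:

←%3pairlis[(X Y);(2 3);NIL] = ((X . 2)(Y . 3))%*

and if the third argument were a non-empty table, the new pairs would simply appear
in front of the old entries.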


.BEGIN CENTERIT;GROUP;
←%2Problems

I  %1Complete the abstract versions of %3sogm%* and Co.
.END

.SS(Examples of %3eval%*,examples of %3eval%*)

After this onslaught, some examples are in order.
Let's evaluate %3f[2;3]%* where %3f <= λ[[x;y] x%82%* + y].%1
After appropriate translation this is equivalent to evaluating:

←%3eval[(F 2 3);((F . (LAMBDA(X Y)(PLUS (EXPT X 2)Y))))]%*

Notes:

%21.%*  %3((F . (LAMBDA (X Y) ... ))) = ((F LAMBDA (X Y) ... )) %*

%22.%*  Since the symbol table, %3((F ...))%* occurs so frequently in the following
trace, we will abbreviate it as %3st%*. Note that we have no mechanism yet
for permanently increasing the repertoire of known functions.  We must
resort to the subterfuge of initializing the symbol table to get the function
%3f%* defined.

%23.%*  For this example we must assume that + and ↑ (exponentiation) are known functions.  Thus
apply would have to contain recognizers for %3PLUS%* and %3EXPT%*:

.BEGIN NOFILL;
.GROUP
%3
	...atom[fn] → [eq[fn;PLUS] → +[car[x];cadr[x]];
			eq[fn;EXPT] → ↑[car[x];cadr[x]]; 
			 ...
.APART

%1
.END
.BEGIN NOFILL;TABS 10,25;TURN ON "\";
.GROUP
So %3eval[(F 2 3);st]
\= apply[car[(F 2 3)]; evlis[cdr[(F 2 3)];st];st]
\= apply[F;evlis[(2 3);st];st]
\= apply[F;(2 3);st]
\= apply[eval[F;st];(2 3);st]
\= apply[(LAMBDA(X Y)(PLUS (EXPT X 2) Y)); (2 3);st]

\= eval[caddr[(LAMBDA(X Y)(PLUS (EXPT X 2) Y))];
\        pairlis[cadr[(LAMBDA(X Y)(PLUS (EXPT X 2) Y))];(2 3);st]]

\= eval[(PLUS (EXPT X 2) Y);pairlis[(X Y);(2 3);st]]
\= eval[(PLUS (EXPT X 2) Y);((X . 2)(Y . 3)(F LAMBDA(X Y) ...))]

\= apply [PLUS; evlis[((EXPT X 2) Y);((X . 2)(Y . 3)..)];((X . 2)...)]

\     %1Let's do a little of:%3  evlis[((EXPT X 2)Y);((X . 2)(Y . 3)...)]
\\= cons[eval[(EXPT X 2);((X . 2)(Y . 3) ...)];
\\         evlis[(Y);((X . 2) ...)]]
\\= cons[apply[EXPT;evlis[(X 2);((X . 2)...)];((X . 2) ...]
\\         evlis[(Y); ...]]

\\= cons[apply[EXPT;(2 2);((X . 2) ...];evlis[(Y); ...]]
\\= cons[↑[car[(2 2)];cadr[(2 2)]];evlis[(Y); ... ]]
\\= cons[↑[2;2];evlis[(Y); ... ]]
\\= cons[4;evlis[(Y);((X . 2)(Y . 3) ....)]]
\\= cons[4;cons[eval[Y;((X .2) ...)]; evlis[NIL;(( ...))]]]
\\= cons[4;cons[3;NIL]]
\\= (4 3)  %1 which you knew anyway.

Now back to apply:
%3
\= apply[PLUS;(4 3);((X . 2)(Y . 3) ... )]
\= +[4;3]
\= 7	%1which you also knew.
.APART
.END

It should now be clear that %3eval%* and friends %2do%* begin to
perform as you would expect.  It perhaps is not clear that a
simpler scheme might do as well.  In particular, the complexity
of the symbol table mechanism which we claimed was so important
has not been exploited. The next example will indeed show that a
scheme like ours is necessary to keep track of variable bindings.
.GROUP

Let's sketch the evaluation of %3fact[3]%* where:

←%3fact <= λ[[x][x = 0 → 1;T → *[x;fact[x-1]]]].%*

.P42:
That is, %3eval[(FACT 3);st] %*where %3st%* names the initial symbol table,

←%3((FACT LAMBDA(X)(COND((ZEROP X)1)(T(TIMES X(FACT (SUB1 X))))))).%1
.APART

In this example we will assume that the binary function, *, the 
unary predicate, %3zerop#<=#λ[[x]x#=#0]%* and unary function, 
%3 sub1#<=#λ[[x]#x-1]%* are known and are 
recognized in the evaluator by %3TIMES, ZEROP%* and %3SUB1%* respectively.
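
As with %3PLUS%* and %3EXPT%* in the previous example, this means %3apply%* would need
recognizers of the form:

.BEGIN NOFILL;
.GROUP
%3
	...atom[fn] → [eq[fn;TIMES] → *[car[x];cadr[x]];
			eq[fn;ZEROP] → zerop[car[x]];
			eq[fn;SUB1] → sub1[car[x]];
			 ...
.APART
%1
.END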

.BEGIN NOFILL;TURN ON "\";TABS 10,20,30;
.GROUP

%1Then%3 eval[(FACT 3);st]
\= apply[FACT;evlis[(3);st];st]
\= apply[(LAMBDA(X)(COND ...));(3);st]
\= eval[(COND((ZEROP X)1)(T( ...)));((X . 3) . st)]
\= evcond[(((ZEROP X)1)(T(TIMES X(FACT (SUB1 X)))));((X . 3) . st)]
%1(Now, let %3st1%* be%3 ((X . 3) . st) )
\= eval[(TIMES X(FACT (SUB1 X))); st1]
\= apply[TIMES;evlis[(X (FACT (SUB1 X))); st1];st1]
\= apply[TIMES;cons[3;evlis[((FACT (SUB1 X))); st1]];st1]

%1Now things get a little interesting inside %*evlis:
\evlis[((FACT (SUB1 X)));st1]
\\= cons[eval[(FACT (SUB1 X)); st1];NIL]
\\%1 and %* eval[(FACT (SUB1 X));st1]
\\\= apply[FACT;evlis[((SUB1 X));st1];st1]
\\\= apply[FACT; (2);st1]
\\\= apply[(LAMBDA (X)(COND ...));(2);st1]
\\\= eval[(COND((ZEROP X) 1) ...));((X . 2) . st1)]
\\\...
.APART
.FILL
%1

Notice that within this latest call on %3eval%* the symbol-table-searching function,
%3assoc%*, will find the pair %3(X . 2)%* when looking for the value of %3x%*. This is
as it should be.  But notice also that the older binding, %3(X . 3)%*, is still 
around in the symbol table %3st1%*, and will become accessible once we complete
this latest call on %3eval%*.
.GROUP

As the computation continues, the symbol table proceeds from,
.NOFILL
%3
←st =((FACT LAMBDA(X)(COND ...))) %1to%3,
←((X . 3) . st) %1to%*,
←((X . 2)(X . 3) . st) %1to%*,
←((X . 1)(X . 2)(X . 3) . st) %1to finally%*,
←((X . 0)(X . 1)(X . 2)(X . 3) . st).
%1
.FILL
.APART

Since the search function, %3assoc%1, always proceeds from left to right through
the table and since the table entry function, %3pairlis%*, always %3cons%*es  pairs
onto the front of the table before calling %3eval%*, we will get the correct
binding of the variables.


This symbol table mechanism is very important, so let's look at it
again in a slightly different manner.

.END

.<<the |   | symbol table >>
.P43:
We will represent calls on %3eval%* informally, using LISP expressions
rather than Sexprs as input.  We represent the symbol table between vertical
bars, "|", in such a way that if a table, t%41%*, is:
.BEGIN TABIT2(10,17);

\|  b%4n%*\|
\|   ...\| then %3cons%*ing a new element, b%4n+1%* onto t%41%* gives:
\|  b%41%*\|


\|  b%4n+1%*\|
\|  b%4n%*\|
\|   ...\|
\|  b%41%*\|

.END


%3

.BEGIN NOFILL;TABS 10,46,58;TURN ON "\";
%1Then:%*

eval[`fact[3]'; |fact = λ[[x][x=0 → 1;T → *[x;fact[x-1]]]]|]

.GROUP
\= eval[`λ[[x][x=0 → 1; T → *[x;fact[x-1]]]]';\|x = 3         | ]
\\|fact = λ[[ ...  |

.APART
.GROUP
\= *[3;eval[`λ[[x] ....'; \|    x = 2      |]
\\|    x = 3      |
\\|fact = λ[[ ...] |


.APART
.GROUP
\= *[3; *[2;eval[`λ[[x] ... ';\|    x = 1     |]
\\|    x = 2     |
\\|    x = 3     |
\\|fact = λ[[ ... |


.APART
.GROUP
\= *[3; *[2; *[1;eval[`λ[[x] ... ';\|    x = 0     |]
\\|    x = 1     |
\\|    x = 2     |
\\|    x = 3     |
\\|fact = λ[[ ... |

.APART
.GROUP
\= *[3; *[2; *[1;1]]]  %1with:%*\|     x = 1     |
\\|     x = 2    |
\\|      ...         |

.APART
.GROUP
\= *[3; *[2;1]]         %1with:%*\|     x = 2     |
\\|      ...         |

.APART
.GROUP
\= *[3;2]               %1with:%*\|     x = 3     |
\\|     ...          |

\= 6.                   %1with:%*\|fact = λ[[ ... |.

= 6
.END
%1

Notice that after we went to all the trouble to save the old values
of %3x%* we never had to use them.  However, in the general case of 
recursive evaluation we must be able to save and restore the old values
of variables.

For example, recall the definition of %3equal%*:
.BEGIN NOFILL TURN ON "\";TABS 17;


%3
equal <= λ[[x;y][atom[x] → [atom[y] → eq[x;y];T → NIL];
\ atom[y] → NIL;
\ equal[car[x];car[y]] → equal[cdr[x];cdr[y]];
\ T → NIL]]

%1If we were evaluating:%3
←equal[((A . B) . C);((A . B) . D)],
%1
then our symbol table structure would be changing as follows:
.END
%3

.BEGIN NOFILL;TABS 10,40;TURN ON "\";
.GROUP

\|equal = λ[[x;y] ...| ==>\|x = ((A . B) . C)| ==>
\\|y = ((A . B) . D)|
\\|equal = λ[[x;y]... |

\|   x = (A . B)     |\|       x = A        |
\|   y = (A . B)     |\|       y = A        |
\| x = ((A . B). C)|   ==>\|    x = (A . B)    | 
\| y = ((A . B). D)|\|    y = (A . B)    | ==>
\|equal = λ[[x;y ... |\| x = ((A . B). C) |
\\| y = ((A . B). D) |
\\|equal = λ[[x;y] ...|
.APART

\|       x = B         |\|        x = C        | 
\|       y = B         |\|        y = D        |
\|    x = (A . B)     |\|  x = ((A . B). C)  | ==>
\|    y = (A . B)     | ==>\|  y = ((A . B). D)  |
\| x = ((A . B). C)  |\|equal = λ[[x;y] ...  |
\| y = ((A . B). D)  |
\|equal = λ[[x;y] ... |


←|equal = λ[[x;y] ... |
.END
%1

This complexity is necessary, for while we are evaluating
%3equal[car[x];car[y]]%*, we rebind %3x%* and %3y%* but we must save the old 
values of %3x%* and %3y%* for the possible evaluation of %3equal[cdr[x];cdr[y]].%*

But the claim is that using %3pairlis%* to cons the new 
bindings onto the front of the symbol table as we call %3eval%*
does the right thing. That is, in %3apply:

←eq[car[fn];LAMBDA] → eval[caddr[fn];pairlis[cadr[fn];x;a]].

%1
The tricky part is that when we leave that particular call on
%3apply%*, the old table is automatically restored by the recursion
mechanism.  That is, if %3st%* is the current symbol table then
%3cons%*-ing things onto the front of %3st%* doesn't change %3st%*, but if we
call %3eval%* or %3apply%* with a symbol table of say:

←%3cons[(X . 2);cons[(X . 3); st]] %*

then in %2that%* call on  %3eval%* or %3apply%* we have access to %3x = 2%*, not 
%3x = 3%*.
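Indeed, %3assoc[X;((X . 2)(X . 3) . st)]%* returns %3(X . 2)%*, the most recently
added binding.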

.SS(Environments,,P77:)
.<< the Weizenbaum symbol table>>
This section will introduce one more notation
for describing symbol tables or environments. This notation, due to
J. Weizenbaum, only shows the abstract structure of the symbol table manipulations
during evaluation. Its simplicity will be of great benefit when we introduce
more complex binding schemes necessary for function-valued functions. 
See {yonss(P76)}.
The later work of Bobrow and Wegbreit covers much of this topic but
in more detail than is needed for clear understanding.

An environment will be described as:

.BEGIN TABIT2(36,40);
\\Form
\\E%4i%*
\\|
\  E%4c%*\| E%4a%*
\_________
\var\| value
\\|
\    .....
\\|

.END

Form is the current form being evaluated.
E%4i%* is the name of the current environment or symbol table. The evaluation
of the current form takes place with respect to this environment. Variables
appearing in the form are evaluated in this environment. If the variable is
not found in E%4i%*, then E%4a%* is searched. If the variable is not found
in E%4a%* then the environment mentioned in the upper right-hand quadrant
of E%4a%* is searched. The search will terminate if the variable is found
or if the symbol "/" appears in the right-hand quadrant. In this second
case the variable is undefined. E%4a%* is the beginning of the 
%2⊗→access environment↔←%* for global variables.

E%4c%* is called the %2⊗→control environment↔←%*. This environment
is the symbol table to be  restored when the evaluation on the current
form has been completed.  In all examples we have seen so far E%4a%* has
been the same as E%4c%*. Things will soon get more complex.

For now, the notation is used as follows: when we begin, an initial table E%40%* is
set up with "/" in its access and control fields. Execution of "<=" will
add a var-val  entry to the table. When a λ-expression is entered, i.e.,
when we bind the evaluated arguments to the λ-variables, a new E%4i%* is set up
with access and control being the current environment.
Entries reflecting the binding of the λ-variables are made in E%4i%* and
evaluation of the λ-body is begun. When the evaluation is completed, the old
control environment is restored as the current environment.

First, consider the evaluation of %3f[2;3]%* where %3f#<=#λ[[x;y]x%82%*#+#y]%1.
The flow of symbol tables is:

.BEGIN TURN ON "\";NOFILL;TABS 10,13,18,23,26,41,46,49,54,57,60,65,70,73;
.GROUP

\\%3f <= ...\\\f[2;3]\\\x%82%* + y%1
\\E%40%*\\\E%40%*\\\E%41%*\\\E%40%*
\ /\| /\\ /\| /\\E%40%*\| E%40%*\\ /\| /
\______\=>\______\=>\______\=>\______
\\|\  \ %3f\| λ[[x;y]x%82%* + y]%1\  \%3x\| 2\  \f\| λ[[x;y] ...
\\\\\\\ y\|3%1

.APART
.FILL;
The execution of %3fact[3]%* on {yon(P42)} results in a more interesting
example.

.BEGIN TURN ON "\";NOFILL;TABS 10,13,18,23,26,31,36,39,44,47,50,55,60,63;
.GROUP
\\%3fact[3]\\\fact[2]\\\fact[1]\\\fact[0]%1
\\E%40%*\\\E%41%*\\\E%42%*\\\E%43%*\\\E%44%*   ...etc.
\ /\| /\\E%40%*\|  E%40%*\\E%41%*\|  E%41%*\\E%42%*\|  E%42%*\\E%43%*\|  E%43%*  ...
\______\=>\______\=>\______\=>\______\=>\______
\%3fact\| λ[[x] ...\ x\| 3\\ x\| 2\\ x\| 1\\ x\| 0   ...%1

Notice in this example we will get the correct binding of %3x%*.
.APART
.END
.GROUP

As a final example using non-local variable bindings consider:

%3f[3]%* where: %3 f <= λ[[x]g[2]]%* and %3g <= λ[[y] x+y]%*.

.NOFILL;
\%3f <= ...; g <= ...\f[3]\\\g[2]\\\x + y     ....
\\E%40%*\\\E%40%*\\\E%41%*\\\E%42%*             ....
\ /\| /\\ /\| /\\E%40%*\| E%40%*\\ E%41%*\| E%41%*
\______\=>\______\=>\______\=>\______\           ......
\\|\\ %3f\| λ[[x] g[2]]\\%3x\| 3\  \y\| 2       ...
\\\\%3g\| λ[[y] x+y]
.END

Notice that when we attempt to evaluate %3x + y%* we find %3y%* has a local
value, but we must look down the access chain to find a binding for %3x%*.







.REQUIRE "PROB8.PUB" SOURCE_FILE;
.SS(%3label%*,%3label%*,P38:)
One effect of placing "λ" and a list of λ-variables in front
of an expression is to bind those variables which appear both in
the λ-list and the expression. All other  variables
appearing in the expression are %2⊗→free variables↔←%*.
For example, %3f%* is free in the following:

←%3f <= λ[[x][zerop[x] → 1; T → *[x;f[x-1]]] ].%1

Clearly our intention is that the %3f%* appearing to the right of "<="
is the same %3f%* appearing to the left of "<=". That is, we want %3f%* to be
a bound variable and not free.

This has not been a problem for us.  We have simply pre-loaded the
symbol table, binding %3f%* to its definition (or value). See {yon(P42)}. LISP
has been equipped with a slightly more elegant device for this binding.
It is called the %3label%* operator, and it is used thus:

←%3label[%1<identifier>;<function>]

and has exactly the effect of binding the <identifier> to the <function>.

Note that %3label%* is a special form. To include %3label%* in the LISP syntax add:

.BEGIN TABIT1(11);CENTERIT;

<function>\::= %3label%*[<identifier>;<function>]

and the sexpr translation of the %3label%* construct should naturally be:

←%3(LABEL %1translate of <identifier>    translate of <function> %3).

.END
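For example, using the %3fact%* function of {yon(P42)} we could write,

←%3label[fact;λ[[x][x = 0 → 1; T → *[x;fact[x-1]]]]]%*

whose translate is,

←%3(LABEL FACT (LAMBDA (X) (COND ((ZEROP X) 1) (T (TIMES X (FACT (SUB1 X)))))))%*

and such an expression may be applied directly, without pre-loading the symbol table
with a definition of %3fact%*.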

%2←Problem%1

1.  Show how to change %3eval%* to handle %3label%*.
.SS(functional arguments,functional argument,P76:)
.BEGIN TURN ON "#","←";

Recall our discussion of :
%3eval[(F#2#3);((F#.(LAMBDA(X#Y)(PLUS#(EXPT#X#2)Y))))].%*
We now know this is equivalent to: 
%3eval[((LABEL#F#(LAMBDA(X#Y)(PLUS#(EXPT#X#2)#Y)))#2#3);NIL]%1.
In either case, the effect is to bind the name %3f%* to the λ-expression.
Binding is also going on when %3f%* is called: we bind %32%* to %3x%*, and %33%*
to %3y%*. Acknowledging Occam, why not attempt to combine the binding processes?
For example if %3 twice[x]%* is defined to be %3plus[x;x]%*, and %3foo[f;x]%*
were %3f[x]%*, then a truly expensive way of evaluating %3twice[2]%* would appear
to be %3foo[twice;2]%*.

.P81:
This will almost work; as usual %3foo%* would be considered a %2function%* by %3eval%*
and the arguments would be evaluated.  We don't want the %2value%* of %3twice%*, we
want the %2name%*. So to stop evaluation, we might try %3QUOTE%*-ing %3twice%* when
we call %3foo%*; thus %3(FOO#(QUOTE#TWICE)#2)%*. %3QUOTE%*-ing is not quite
good enough, as we will see in a moment, but it would work here. %3f%* in the definition 
of %3foo%* is called a  %2functional argument%*; %3f%* is a λ-variable, but appears
in the body of the function where we expect a function. You, of course, 
should realize by now that %2all%* function names appearing in our LISP expressions
are variables; hopefully they are bound to %2something%* (primitives or λ-expressions)
before we attempt to evaluate them.

Using Weizenbaum's environments we might get:
.BEGIN GROUP;TURN ON "\";NOFILL;TABS 10,13,23,28,31,46,51,54,59,62,65;

\%3foo[twice;2]\\twice[2]\\   x + x%*
\\E%40%*\\\E%41%*\\\E%42%*
\ /\| /\\ E%40%*\| E%40%*\\ E%41%*\| E%41%*
\______\=>\______\=>\______
%3         twice\\| λ[[x] x + x]\\ f\| "twice"\\ x\| 2
%3           foo\\| λ[[f;x] f[x]]\\ x\| 2

%1So we %2will%* get the right bindings.
.END

If such baroque examples were the limit of functional arguments, their usefulness
would be questionable; however, they are both a powerful programming tool, and their
implementation sheds much light on programming language design (see {yonss(P3)}).

In particular, most implementations  of LISP include a very useful class 
of ⊗→mapping functions↔←.

.BEGIN INDENT 0,10;
⊗→%3maplist%*↔← is a function of two arguments, a list, %3l%*, and a functional 
argument, %3fn%*. %3maplist%* applies the function %3fn%* (of one argument) 
to the list %3l%* and its %3cdr%*s until %3l%* is reduced to %3NIL%*.
The value of %3maplist%* is the list of the values returned by %3fn%*.
Here's a definition of %3maplist%*:

.END
.GROUP;

←%3maplist <= λ[[fn;l][null[l] → NIL; T → cons[fn[l];maplist[fn;cdr[l]]]]]%*.

Thus:

%3←maplist[REVERSE;(A B C D)] = ((D C B A)(D C B)(D C)(D))%*.
.APART;

.BEGIN INDENT 0,10;
⊗→%3mapcar%*↔← is a similar function, applying %3fn%* not to %3l%* but the
%3car%* of %3l%*. Thus:
.END
.GROUP
.P34:
←%3mapcar <= λ[[fn;l][null[l] → NIL; T → cons[fn[car[l]];mapcar[fn;cdr[l]]]]].%*

and an example:

←%3mapcar[(LAMBDA(X)(CONS X NIL));(A B C D)] = ((A)(B)(C)(D)).%*
.APART

To see why %3QUOTE%* is not sufficient, consider the following strange example:

.BEGIN TABIT2(10,29);GROUP;
%3\tester <= λ[[l;fn]\[null[l] → NIL; 
\\ atom[l] → fn[ ];
\\ T → tester[car[l];(LAMBDA()(TESTER(CDR L) FN))]]]%*
.END

Notice %3fn%* is a functional argument and we are %3QUOTE%*-ing the actual
parameter bound to %3fn%*
in the recursive call on the function, %3tester%*.
Now if we call %3tester%* with: %3tester[(A#(B)#(C));(LAMBDA()(BAZ#(QUOTE#A)))]%*
the following occurs:

%21.%*  %3l%* is bound to %3(A (B) (C))%*; %3fn%* is bound to 
%3(LAMBDA()(BAZ#(QUOTE#A)))%*.

%22%*.  In the conditional expression of %3tester%* p%41%* and p%42%* are false
and thus we evaluate the "otherwise" branch.

%23%*.  We rebind %3l%* to %3A%*, rebind %3fn%* to %3(LAMBDA()(TESTER(CDR#L)#FN))%*
and lose big.

%3null[l]%* is  false, but %3atom[l]%* is true so we evaluate %3fn%*.
When we attempt to evaluate %3cdr[l]%* we get the wrong binding of %3l%*.
What we find bound to %3l%* is %3A%*, rather than %3(A#(B)(C))%* which we 
expected.
The environment which was current at the point of call in step %23%* should 
have been associated
with %3fn%* when we rebound %3fn%*.  That is, we must associate the
symbol table which was current at the %2point of call%* with the functional argument.

We must now modify our use of the Weizenbaum environments to show this association.
When an assignment of the form %3f#<=#λ[[#...#]#...#]%* is performed we
will add the name %3f%* as usual but will now add %3λ[[#...#]#...#]#:%1E%4i%1
as value where E%4i%* is the environment at the point of assignment. When %3f%*
is %2called%* we set up a new environment as before, but the access environment
will now be set to E%4i%*.
This solution might seem a bit drastic, but it is easy to construct
counterexamples to less sophisticated  solutions.
For example, attempting to substitute values for the free variables at the time
  "<=" is done is not sufficient.
Consider the following example:

.BEGIN TURN ON "\";NOFILL;TABS 10,13,28,33,36,51,56,59,64,67,70;
\\%3g[3]\\\f[2]\\\x + y + z       ....%1
\\E%40%*\\\E%41%*\\\E%42%*     ...
\ /\| /\\E%40%*\| E%40%*\\E%41%*\| E%40%*    ....
\______\=>\______\=>\______    ....
\%3f\| λ[[x]x+y+z]:%1E%40%3\\y\| 3\\x\| 2
\g\| λ[[y]f[2]]
\z\| 4
.END
Notice that we could substitute the value for %3z%* when %3f%* is defined
but we cannot do so for %3y%*. Even if there were a value for %3y%* at this time
it would be the wrong value.


What does this say about functions?  We have already remarked that functions are
parametric values; to that we must add: functions are also tied to the environment
in which they were created; they cannot be  evaluated %3in vacuo%*.

The device LISP used to associate environments with functions is called
the ⊗→%3FUNARG%*↔← hack. (More is said about %3FUNARG%* in {yonss(P3)}.)
Instead of  designating an instance of a functional argument by the %3QUOTE%* 
operator, we introduce a new operator called ⊗→%3FUNCTION%*↔←. Thus:

←%3function[λ[[]tester[cdr[l];fn]]] %*or,

←%3(FUNCTION(LAMBDA()(TESTER (CDR L) FN))).%*

When %3eval%* sees  the construction %3(FUNCTION %1fn%3)%* it returns as value
the list:

←%3(FUNARG###%*fn### current-symbol-table%3)%*.

When %3eval%*, or actually %3apply%* sees %3(FUNARG %*fn st%3)%*, that is,
when we are %2calling%3 fn%1, we use the symbol table %3st%*, rather than
the current symbol table for accessing global variables.  
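For example, in terms of the %3tester%* function above: when %3eval%* sees
%3(FUNCTION#(LAMBDA()(TESTER#(CDR#L)#FN)))%* while %3l%* is still bound to
%3(A#(B)#(C))%*, it returns %3(FUNARG#(LAMBDA()(TESTER#(CDR#L)#FN))#st)%*, where
%3st%* contains the pair %3(L#.#(A#(B)#(C)))%*. When that %3FUNARG%* construct is
later applied, %3(CDR#L)%* is evaluated using %3st%* and yields %3((B)#(C))%*, as we
intended.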

.<<pic of funarg execution>>
.P79:
Let's follow the behavior of an %3eval%* and %3apply%* family which
has been  modified to handle functional arguments.

Let %3fn <= λ[[x%41%*; ... ;x%4n%*] ...]%1 be a function which will be used
as a functional argument in the context, say:

.BEGIN SELECT 3; CENTERIT;GROUP;

←fie <= λ[[ ...;foo ...]  .... foo[ ...] ]
←fie[ ...; function[fn]; ...]

.END
In the sequel %3st%* and %3st%4save%1 will name symbol tables. "→" will mean
"points to" (i.e.,#%3cons%*). p%4i%*'s will be dotted pairs in a symbol table.
Then let's see what calling %3fie[#...;#function[fn];#...]%* will do.

.BEGIN TABIT2(10,60);GROUP

\%3(FIE ... (FUNCTION FN) ...)  st:\NIL    ⊗↓%3st%1 is not really empty.
Obviously it contains the function definitions.←
\         ....
\%1computation               %3st%1 :\p%4n%* → p%4n-1%* → ... p%41%*
\%3(FUNCTION FN)
\%1   gives:
\%3(FUNARG FN st%4save%*)
\         ....
\%1computation         %3st%1 : p%4m%* → p%4m-1%* ...→ 
\%3(FOO %1a%41%* ... a%4n%*)
\%1   gives:
\%3((FUNARG FN st%4save%*)%1a%41%* ... a%4n%3)

.BEGIN FILL;NARROW 0,40;
%1The a%4i%*'s are evaluated in the context %3st%* %2not%* %3st%4save%1 by
%3evlis[#...;#st]%* giving v%4i%1's.

We then apply %3fn%* to the v%4i%*'s in the context %3st%4save%1 %2not%*
in environment %3st%* by calling %3apply[#...#;#...#;st%4save%*].

%1This results in:

%3eval[ %1body of %3fn%*; %3((x%41%1 . v%41%3) → ... (x%4n%1 . v%4n%3)%1 → %3st%4save%*]%1

.END
.END
After we finish this inner call on %3apply[#...#;#...#;st%4save%*]%1 the table %3st%*
is restored. Notice that our symbol tables are now really tree-like rather than 
stack-like.
It should be apparent from %3eval%* that %3(QUOTE ...)%* 
and %3(FUNCTION ...)%* will have the
same effect if %3fn%* has no global (or free) variables.

This seems like a lot of work to allow a moderately obscure construct to appear in a language.
However constructs like functional arguments appear in most programming languages
under different guises. Usually the syntax of the language is sufficiently
obfuscatory that the true behavior  and implications of
devices  like functional arguments is misunderstood. Faulty implementations
usually result. In LISP the problem %2and the solution%* appear
with exceptional clarity.  

Functional arguments may be exploited to describe very general control structures.
We will say more about this application later.

Finally, here is a sketch of the abstract structure of the current %3eval%*.

.BEGIN SELECT 3;GROUP;TABIT1(12);

eval <= λ[[exp;environ]
\[isvar[exp] → value[exp;environ];
\ isconst[exp] → denote[exp];
\ iscond[exp] → evcond[exp;environ];
\ isfun[exp] → makeclosure[exp;environ];
\ isfunc+args[exp] → apply[func[exp];list_of_evaled_args[exp;environ];environ]]]
.APART

.GROUP
%1where:%*
apply <= λ[[fn;args;environ]
\[isfunname[fn] →  ....
\ islambda[fn] → eval[body[fn];newenviron[vars[fn];args;environ]]
\ isclosure[fn] → apply[func%41%*[fn];args;env[fn]]

\      ....          ..... ]]

.APART
.END

←%2Problems%*

I.  What changes should be made to the LISP syntax equations to
allow functional arguments?

II. Use %3foo%* on {yon(P81)} to define a function which computes factorial without
using %3label%* or explicit calls on the evaluator.

III. Extend %3eval%* and friends to functional arguments.
.END
.SS(special forms,special form,P8:)
We have remarked that the evaluation scheme for LISP functions is
call-by-value and, for functions with multiple arguments, left-to-right
evaluation of arguments. We have also seen, in %3QUOTE%* and %3COND%*,
that not all forms to be evaluated in LISP fall within this category.
We have already noted on {yon(P44)} that %3QUOTE%* and %3COND%* are
%2not%* translates of functions. Clearly we don't evaluate the arguments to
%3(QUOTE ...)%*; indeed the purpose of %3QUOTE%* was to %2stop%* evaluation.
Also the `arguments' to %3COND%* are handled differently; their evaluation was
handled by %3evcond%*.
We therefore have called %3QUOTE%* and %3COND%* %2special forms%*.
We would like to discuss special forms as a generally useful technique.

Consider the predicates, ⊗→%3and%*↔← and ⊗→%3or%*↔←. We might wish to define
%3and%* to be a binary predicate such that %3and%* is true just in case
%2both%* arguments evaluate to %3T%*; and define %3or%* to be binary
and false just in case both arguments evaluate to %3NIL%*.
Notice two points. First, there is really no reason to restrict these predicates
to be %2binary%*. Replacing the words "binary" by "n-ary" and "both" by "all"
in the above description has the desired effect.
Second, if we evaluate the arguments to these predicates in some order,
say left-to-right, then we could immediately determine that %3and%*
is false as soon as we come across an argument which  
evaluates to %3NIL%*; similarly a call on %3or%* for an arbitrary number
of arguments can be terminated as soon as we evaluate an argument giving
value %3T%*.

But if we insist that %3and%* and %3or%* are %2functions%* we can 
take advantage of neither of these observations. Functions have a fixed
number of arguments, all of which are evaluated. The solution should be
clear: define %3and%* and %3or%* as special forms and handle
the evaluation ourselves. Presently, the
only way to handle special forms is to make explicit modifications
to %3eval%*.  This is not terribly difficult (or desirable). When
we see a more realistic version of %3eval%* and Co. in {yonss(P30)}, we will have
a simple way to add such forms. See also {yon(P54)}.

Recognizers for the predicates must be added to %3eval%*:

.BEGIN CENTERIT;SELECT 3;
←eq[car[e];AND] → evand[cdr[e];a];
←eq[car[e];OR] → evor[cdr[e];a];
←.....%1      where:
.END

.BEGIN SELECT 3;TABIT1(16);

.P53:
evand <= λ[[l;a]\[null[l]→ T;
\ eval[car[l];a] → evand[cdr[l];a];
\ T → NIL] ]

evor <= λ[[l;a]\[null[l] → NIL;
\ eval[car[l];a] → T;
\ T → evor[cdr[l];a]] ]

.END

Notice the explicit calls on %3eval%*. This is expensive, but cannot be helped.
Later we will show a less costly way to handle those "non-functions" which
only have an indefinite number of arguments, all of which are
to be evaluated (see {yonss(P18)} on macros).


←%2Problems%*

I  What is the difference between a special form and call-by-name? Can call-by-name
be done in LISP (without redefining %3eval%*)?

.P71:
II %3select%* is a special form to be called as: 
%3select[q;q%41%*;e%41%*;#...#;q%4n%*;e%4n%*;e]%1 and to be evaluated as follows:
%3q%* is evaluated; the %3q%4i%1's are evaluated from left to right until
one is found  with the value of %3q%*. The value of %3select%* is the value
of the corresponding %3e%4i%1. If no such %3q%4i%1 is found the value of
%3select%* is the value of %3e%1. %3select%* is a precursor of the 
%2⊗→case statement↔←%*. See {yon(P70)}.
Add a recognizer to %3eval%* to handle %3select%* and write a function to
perform the evaluation of %3select%*. 


.SS(The %3prog%*-feature,,P37:)
.TURN ON "←";
%1

Recursion seems a bit supernatural, but it is legal, mechanizable
and rather cool.
There is another similar ⊗→bind↔←ing process occurring in LISP.  It is
connected with an iterative style of LISP programming called the
⊗→%3prog%*↔←-feature.

First a general disclaimer: Some pure ("For my next trick I'll walk
on the water") LISP programmers feel that there is something slightly
obscene about writing iterative style programs.  This isn't quite
true, but there are some good reasons for emphasizing recursion.

%21.%* Anyone can write an iterative scheme.  Recursion is a powerful
tool  and very possibly a new programming technique
for you.  You should exercise your skill and resort to the %3prog%*
as a last resort.

%22.%* Two of the usual trappings of iteration are %2labels%* and %2go-to%*'s. 
These are truly tools of the devil.  In recent years several studies
by reasonable men have shown that algorithms which resort to labels
and gotos tend to be harder to read, harder to modify, sloppy in their
analysis of the process being coded, and generally ugly.  The label
and goto hack invites patching over poor analysis instead of 
reanalyzing the problem.

%23.%* With the control of the loop structure (either recursion or some
kind of controlled iteration, e.g., a FOR or WHILE statement) in the
hands of the language rather than in the hands of the programmer, the
static text and the dynamic flow of the execution have a parallel
relationship. This should have a beneficial effect for studies 
investigating the provability of properties of programs.

Now that this has been said, here's our discussion of %3prog%*s.

Many algorithms present themselves more naturally as iterative
schemes.  Recall the recursive computation of the length of a list given 
on {yon(P19)}. %3length%* seems inordinately complex; our sympathies
lie more with %3length%41%1.  Even this is not as straightforward as you would
expect for such a simple calculation.  Rather, consider the following:

%21.%*  Set a pointer to the given list.  Set a counter to zero.

%22.%*  If the list is empty, return as value of the computation, the 
     current value in the counter.

%23.%*  Otherwise, increment the counter by one.

%24.%*  Set the pointer to the cdr of the list.

%25.%*  Go to line 2.


The new ingredients here are:

%21.%*  An ⊗→assignment↔← statement: "set a ...".

%22.%*  Access to some new cells: the pointer, the counter.

%23.%*  Sequencing (albeit usually implied) between statements: line %21%*, then line %22%*...

%24.%*  Gotos and labels.

%25.%*  An explicit exit from the procedure: line %22%*.

Here is the LISP %3prog%* version of the length function:

.BEGIN TABIT2(15,17);SELECT 3;GROUP;TURN OFF "←";

.P45:
⊗→length↔← <= λ[[x]prog[[l;c]
\\l ← x;
\\c ← 0;
\a\[null[l] → return[c]];
\\l ← cdr[l];
\\c ← c+1;
\\go[a]] ]

.END

A few points should be made: The ⊗→%3prog%*-variables↔←, %3l%* and %3c%1, are ⊗→local variable↔←s.
That is, they only have meaning within this definition of %3length%*. If they
appeared in some program which called %3length%*, then the right thing
would happen; the old values would be saved (like λ-bindings) and then
restored after the call on %3length%* has been completed. Among other things,
this allows %3prog%*s to be used recursively.

Though assignment statements are normally executed for their effect on the
environment (they have side-effects), assignments in LISP also have a value.
The value of an assignment is the value of the right-hand-side.

Conditional expressions have a slightly different effect inside %3prog%*s. If
none of the predicates in the conditional are true, then the statement following
the conditional is executed. 

.P83:
Labels are clear; they are identifiers. Labels are local, that is, they must be found
within the body of the current %3prog%*. Labels may conflict with the 
λ-variables or %3prog%*-variables; the evaluator for %3prog%*s can resolve the
conflicts by context.

⊗→%3go%*↔← is a little more complicated. %3go%* is a special form. If the argument
to %3go%* is %2atomic%* then it is interpreted as a (local) label; otherwise, the argument
is %2evaluated%* and the result of the evaluation is interpreted as a label.
This results in a useful programming trick.  Let %3l%* be a list of dotted pairs,
each of the form, %3(%* object%4i%* . label%4i%3)%1. At each label%4i%* we
begin a piece of program to be executed when object%4i%* has been recognized.
Then the construct:

.P25:
←%3go[cdr[assoc[x;l]]]%*           (*1*),

can be used to "dispatch" to the  appropriate code when %3x%* is one of the
object%4i%*. This is an instance of %2⊗→table-driven↔←%* programming.
This programming trick shows "go-to" programming in its most pervasive form.
The blocks of code dispatched to can be distributed throughout the body of the
%3prog%*. Each block of code will usually be followed by a %3go%* back to
the code involving equation *1* (above). In fact the argument %3l%* in *1*
may be %2global%* to the %3prog%*-body.
The general effect is to make a %3prog%* which is very difficult to understand.
The LISP %3select%* ({yon(P71)}) will handle many of the possible applications 
of this coding trick and result in a more readable program. The case-statement ({yon(P70)}) 
present in  some other languages is also a better means of handling this problem.
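
Nevertheless, to make the trick concrete, here is a small sketch of it; the
dispatch table, the labels, and the values returned are invented purely for
illustration (recall that the label %3a1%* is the atom %3A1%* in translation):

.BEGIN TABIT2(17,20);SELECT 3;GROUP;TURN OFF "←";

dispatch <= λ[[x]prog[[l]
\\l ← ((ADD . A1) (SUB . S1));
\\go[cdr[assoc[x;l]]];
\a1\return[PLUS];
\s1\return[DIFFERENCE] ]]

.END

Thus %3dispatch[SUB]%* evaluates to %3DIFFERENCE%*. In a realistic use each
labeled block would do its work and then %3go%* back to the dispatching code,
as described above.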

The ⊗→%3return%*↔← statement evaluates its argument, and returns that value as
the value of the %3prog%*.

When a %3prog%* is entered the %3prog%*- (local) variables are initialized to %3NIL%*.
The body of the %3prog%* is a sequence of statements and labels. Statements are
executed sequentially unless a %3go%* or %3return%* is evaluated. 
If a %3prog%* runs out of statements then %3NIL%* is returned.
Returning from a %3prog%* unbinds the local variables. 

Now to the problem of translating %3prog%*s into a s-expression representation.
We need be a little careful. First notice that %3prog%* is a %2⊗→special form↔←%*
(like %3COND%* and %3QUOTE%*). The construct:

←%3prog[[l;c]....]%*  will be translated to:

←%3(PROG(L C) ...)%*. 

But %3PROG%* is not the translate of a function
any more than %3(QUOTE ...)%* or %3(COND ...)%* is.
So the body of the %3prog%* must be handled specially by a
new  piece of  the evaluator. 

.TURN OFF "←";

Similarly we must be careful about the interpretation of ←. First,
we will write   %3x ← y%*  in prefix form: %3←[x;y]%*. Now, using our usual LISP
technique we might map this to:
.BEGIN CENTER
%3(SET X Y).%*   Is ← or ⊗→%3SET%*↔← a function in the usual LISP sense?
.END

Not likely; for if %3x%* and %3y%* have values %32%* and %33%*, for example, then the
call-by-value interpretation of %3←[x;y]%* would say %3←[2;3]%*. This was not our
intention, hopefully.  What do we want?  We want to evaluate the second argument
to ← while stopping the evaluation of the first argument. But LISP does
have a device for stopping evaluation: ⊗→%3QUOTE%*↔←. So we can define %3SET%* as a normal
LISP function, provided we  

.BEGIN CENTER
translate	%3x ← y%*    as	%3(SET (QUOTE X) Y)%*.
.END

.TURN ON "←";

For example, look at the evaluation of %3(SET (QUOTE X)(PLUS X 1)).%*
Assume the current value of %3X%* is %32%*.
%3SET%* is a function; evaluate the arguments from left to right. The 
value of %3(QUOTE X)%* is %3X%*; the value of %3(PLUS X 1)%* is %33%*; assign %33%* to %3X%*.

Question: what does %3(SET Z (PLUS X 1))%* mean?  Well if the current value of
variable %3z%* (represented by %3Z%*) is an identifier (non-numeric atom), then
%3(SET Z (PLUS X 1))%* makes sense.  Assume the current value of %3Z%* is %3A%*, then
the effect of the %3SET%* statement is to assign the value %33%* to %3A%*.
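
To keep the two cases straight, assume (as above) that the current value of %3X%*
is %32%* and the current value of %3Z%* is %3A%*; then:

.BEGIN CENTERIT;SELECT 3;GROUP;
←(SET (QUOTE X) (PLUS X 1))%1     assigns %33%* to %3X%1
←%3(SET Z (PLUS X 1))%1     assigns %33%* to %3A%*, the current value of %3Z%1
.END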

Normally when you are making assignments, you want to assign to a %2name%* and
not a %2value%*; thus you will be using the form

←%3(SET (QUOTE ...) ...).%*

As a convenience an abbreviation, ⊗→%3SETQ%*↔←, has been made available:

←%3(SETQ ...  ... )%*  means %3 (SET (QUOTE ...) ...).%*

Again note that %3SETQ%* is not  (the translate of) a function. It may be
defined as a special form or considered as a notational convenience like %3list%*.

Here is a translation of the body of the %3prog%* version of %3length:%*
.GROUP
%3
.BEGIN TABIT3(10,17,20);
\(LAMBDA(X)
\\(PROG(L C)
\\\(SETQ L X)
\\\(SETQ C 0)
\\A\(COND ((NULL L)(RETURN C)))
\\\(SETQ L(CDR L))
\\\(SETQ C (ADD1 C))
\\\(GO A) ))

.END
.APART
%1
.REQUIRE "PROB7.PUB" SOURCE_FILE;
.SS(In retrospect)
We will begin with a sketch of the LISP evaluator as it would now appear:
basic %3eval%* plus the additional artifacts for %3label, function%*, and %3prog%*.
This evaluation process is very important and, though it is perhaps new material,
should appear quite intuitive in retrospect. 
%3eval%* and friends are not to be  memorized. If
you cannot write functions equivalent to %3eval%*, then you have not understood
LISP evaluation.

Finally, we will examine some of the weaknesses present in LISP. 
There are alternatives to some of the LISP techniques and there are some things
which, in retrospect, LISP could have done better.


First, the basic evaluator. 
There are two arguments to %3eval%*: a %2form%* ⊗↓throughout this section we
will say "form", "variable", "λ-expression", etc.  rather than "an S-expression
representation of a" ... "form", "variable", "λ-expression", etc. No
confusion should result, but remember that we %2are%* speaking imprecisely.←, that
is an expression which can be evaluated; and second, an association list or
%2symbol table%*. If the form is a number, the atom %3T%* or %3NIL%*, return that
form. If the form is a variable, find the value of the variable in the current
table or environment.  If the form is a (non numeric) S-expression constant, return 
that constant. If the form is a conditional expression, then evaluate it
according to the semantics of conditional expressions. If the form is a
%3prog%*, evaluate the body of the %3prog%* after binding the local %3prog%* variables
to %3NIL%*⊗↓Perhaps you can now see why quoted expressions, conditional expressions,
and %3prog%*s are called %2special forms%*.←.
The form might also be a functional argument. In this case evaluation consists of
forming the ⊗→closure↔← of the function, i.e., associating the current environment 
with the function. In LISP, this is done with the %3FUNARG%* device.
Any other form seen by %3eval%* is assumed to be a function followed by arguments.
The arguments are evaluated from left-to-right and the function is then applied
to these arguments. 

The part of the evaluator which handles function application is called %3apply%*.
%3apply%* takes three arguments: a function, a list of evaluated arguments, and
the current symbol table. If the function is one of the five LISP primitives
then the appropriate action is carried out. If the function is a λ-expression
then bind the formal parameters (the λ-variables) to the evaluated arguments and
evaluate the body of the function. The function might also be the result of a
functional argument binding; in this case apply the function to the arguments
in the saved environment rather than in the current environment.
We might also be applying the label operator. All we need do here is apply the
function definition to the arguments in the current context augmented by the 
binding of the function name to its definition.

We have saved what seems to be the simplest for last. That is, the function 
is named; what we do is look up the name of the function in the current environment.
To look something up means to find its value.
Currently we expect that value to be a λ-expression, which is then applied. 
However, since function names are just variables, there is no reason that the
value of a function name could not be a simple value, say an atom, and 
%2that%* value
applied to the arguments, etc.  Indeed examination of %3apply%* on {yon(P69)}
will show that %3apply[X;#((A B))#;#((X#.#CAR)#...#)]%* %2will%* be handled correctly.
The obvious extension of this idea is to allow a %2⊗→computed function↔←%*. That
is, if the function passed to apply is not recognized as one of the preceding
cases, then evaluate the expression until it is recognized. Thus
we will allow such forms as:
.BEGIN CENTERIT;SELECT 3;
←((CAR#(QUOTE#(CAR#(A#.#B))))#(QUOTE#(A#.#B)))
.END
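
In terms of the sketch of %3apply%* given at the end of the section on functional
arguments, this extension amounts to one more clause at the end; the following is
only a sketch, reusing the names of that earlier sketch:

.BEGIN SELECT 3;GROUP;TABIT1(12);

apply <= λ[[fn;args;environ]
\[isfunname[fn] →  ....
\ islambda[fn] →  ....
\ isclosure[fn] →  ....
\ T → apply[eval[fn;environ];args;environ]]]

.END

For the example above, the last clause evaluates %3(CAR (QUOTE (CAR (A . B))))%*,
obtaining %3CAR%*; %3apply%* is then entered again and proceeds as usual, giving %3A%*.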

What conceptual difficulties are present in LISP evaluation?
One of the more important defects in LISP is its inadequate handling of function
valued variables, functional arguments being a particular case. 
LISP was the first language to handle functional arguments so it is not too surprising
that all is not quite right.

The %3FUNARG%* device is sufficient for simple uses of functional arguments and
closures. However, we should like to return functions and closures as %2values%*.
Returning open functions is easy; we can simply %3cons%*-up λ-expressions.
Returning closures is more difficult; {yonss(P78)} begins to discuss this problem.

****more***

.BEGIN TURN ON "←";
%2Problems%*

I  Write complete versions of %3eval%* and %3apply%*.
.END
.SS(Towards a Mathematical Semantics)
In {yonss(*)} we informally introduced some ideas about proving properties
of programs. We would now like to expand on this idea of introducing
mathematical rigor into the discussion of programming languages. Though a firm
technical basis can be established for the following discussion, we 
wish to proceed at an intuitive level and hopefully arouse curiosity 
without damaging the underlying theory.

First, why worry about foundational and theoretical work at all? Programming,
it is said, is an art and art needs no formalizing. Indeed, but there is some
science in Computer Science and %2that%* part can be analyzed. 

The description of LISP which we have given so far is classified as ⊗→operational↔←.
That means our informal description of LISP and our later description of the LISP
interpreter are couched in terms of "how does it work (operate)". Indeed
the whole purpose of %3eval%* was to describe explicitly what is to happen
when a LISP expression is to be evaluated.  We have seen *** that discussion
of evaluation schemes is non-trivial; and that order of evaluation can affect the
outcome ***.  One difficulty with the operational approach is that it (frequently)
specifies too much. Many times the order of evaluation is immaterial.
There is another way of looking at LISP expressions called ⊗→denotational↔← semantics.
We could say that a LISP form %2denotes%* some sexpression (or is undefined) just
as we could say that 2+2 denotes 4. The scheme which we use to evaluate the expression
is irrelevant; there is some object which our expression denotes and the process
which we use to discover that object is irrelevant. Mathematically speaking then
we have an abstract domain and expressions in our language denote objects in
our domain; the order of evaluation, indeed the %2process%* of evaluation, does not
concern us.

Let us begin to relate this discussion to LISP. Clearly, our domain will include
the S-expressions since %2most%* LISP expressions %2denote%* Sexprs. However, we
must include more; many of our functions are partial functions. Our domain
must then include a denotation for %3undefined%*. We will use the symbol %8U%*.

.SEC(Running on the machine)
%1
This section is for the brave few who wish to run on a machine.
The %2programming language%*, LISP, is the Sexpr translation
of the LISP functions and %3prog%*s that we have been writing. What are 
some of the implications of writing in Sexpr form?

First, LISP becomes the world's largest parenthesis sink.  It makes
for difficult reading at times, but there are formatting tricks, and
editing programs which will help the user match parens and locate
parts of the program.  (It only hurts for a little while).  There is
a practical benefit which greatly outweighs this anguish factor:
since the proper input to LISP programs is Sexprs and since we are writing
LISP programs in Sexpr form then on input, data and program are 
indistinguishable in format; i.e., they are both binary trees. 
Obviously for evaluation, programs must have a very special structure,
but program and data are both trees just as in most hardware machines 
the contents of locations  which contain data are indistinguishable
in format from locations which contain instructions.

On  the bare machine it is the way in which a location is accessed
which determines how the contents of that location are interpreted.
If the central processor accesses the location via the program 
counter, the contents are assumed to be an instruction.  That same
location can usually be accessed as part of a data manipulating 
instruction.  In that case, the contents are simply taken to be data.
LISP is one of the few high-level languages which also has this property.
It gives the programmer exceptional power.  Let's complete the
analogy.  The central processor here is our old friend, ⊗→%3eval%*↔←.
If %3eval%* references a binary tree via its `program counter', then 
that tree is decoded via the internals of %3eval%*.  If a tree is
referenced as input to an Sexpr-massaging function, then it is
taken as data.  As a result, a LISP program can %3cons%*-up a new
Sexpr which can then be evaluated (i.e., interpreted as an instruction).
The simplest way to communicate with such a machine is to read an
sexpr translate of a LISP expression into memory, evaluate the expression, and
print the result. Several implementations of LISP use a slight variant
of this called the "read-eval-print" loop:

←%3prog[[]##a##print[eval[read[];NIL]];go[a]]%*.

.BEGIN TURN ON "#";
Since programming is done using the sexpr translation of LISP functions
it becomes convenient to sometimes say "...#the function %3CAR%*#..." or
"...#write a %3PROG%*#...",#... when we actually mean %3car%* and %3prog%*,#....
The actual LISP functions are written in the language generated by syntax equations
which we have been  including. This language is called the meta-language, and
the LISP expressions are called ⊗→M-expressions↔← or M-exprs.
The distinction between Mexprs and their Sexpr translations must not be forgotten.
.END

Though %3eval%* is the equivalent of the basic Cpu of the ⊗→LISP machine↔←,
there is another artifact in many versions of LISP to communicate
with the outside world.  As with most pieces of I/O equipment,
it leaves something to be desired.  Its name is %3evalquote%*.

.SS(%3evalquote%*,%3evalquote%*)
The basic structure of the  %3evalquote%*-loop is:

%21.%*  read in a (Sexpr representation of) function, i.e., a name
	or a lambda expression.

%22.%*  read in a list of (evaluated) arguments to which the function is to be applied.

%23.%*  apply the function (of step %21.%*) to the argument list.

%24.%*  print the value.

%25.%*  go to step %21.%*

Thus the structure of the loop  is essentially:

.BEGIN TABIT2(24,27);TURN OFF "←";
%3
.GROUP
                   prog[[fn;x;z]
\l\fn ← read[ ];
\\x ← read[ ];
\\z ← evalquote[fn;x];
\\print[z];
\\go[l]]]

%1where: %3evalquote <= λ[[fn;x]apply[fn;x;NIL]]
.APART
%*

or more concisely:

%3
.GROUP
\prog[[ ]
\ l    print[evalquote[read[ ];read[ ]]];
\      go[l]]
.APART

%*
.END
.GROUP
where:

%21.%*  the function, ⊗→%3read%*↔←, negotiates with an input device to read an
Sexpression.

%22.%*  the function, ⊗→%3print%*↔←, prints the value of its argument on an output
device.
.APART

The details of %3read%* and %3print%* will appear when we discuss implementation
of LISP.

.BEGIN TURN ON "#";
Here's an example of the behavior of %3evalquote%*. Assume that the input
device contains %2CAR#((A#B))%*. Then the first %3read%* in %3evalquote%* gets
%2CAR%* (a string representing an Sexpr), and the second read gets %2((A#B))%*; 
then %3apply%* gets called as:
%3

←apply[CAR;((A B));NIL].
%*

%3apply%* says that this is the same as evaluating
%3car[(A#B)]%*, and thus returns %3A%* as its answer, which is dutifully
printed by %3print%*.  
.END

So %3evalquote%* can be used as a `desk calculator' for LISP. 
If we wish to evaluate an expression %3f[a%41%*;#...#a%4n%*]%1 for
%3a%4i%1 constants, i.e. sexprs, then %3evalquote[F;#(a%41%*#...#a%4n%*)]%1
will do it for us.  But the %3evalquote%*-loop will %2not%* evaluate arguments;
the a%4i%1's in the call on %3evalquote%* are not expressions to be evaluated, nor
translates of constants, but the constants themselves.
Typing %2CONS#((CAR#(QUOTE#(A)))#(QUOTE#(B)))%* to the %3evalquote%*-loop
will result in %3((CAR#(QUOTE#(A)))#.#(QUOTE#(B)))%*.

The %3evalquote%*-loop is an anomaly. It does not expect the usual representation
of a LISP form say %3(F e%41%* ... e%4n%*)%1  where the %3e%4i%1's 
are to be evaluated. It expects a %2pair%* of sexprs representing a function
and %2evaluated%* arguments. The benefits of having "functional-notation" as
input and having arguments implicitly evaluated are greatly outweighed
by the confusion introduced by having two formats for LISP expressions, one for
the "top-level" and a different one everywhere else.


Certainly
before we can do any reasonable amount of calculation, we must be 
able to define and name our own functions. The %3label%* operator, or
 calling %3eval%* with an initial symbol table containing
our definitions will suffice, but this is not particularly elegant.
We would like our definitions to have some permanence.

A function (actually a %2special form%*, if you have been paying attention)
named ⊗→%3define%*↔←, whose effect is to add definitions to %3apply%*, has been 
provided. The actual behavior of %3define%* will appear in the sections on
implementation.

.P51:
%3define%* is the name of a special form (its arguments are not evaluated)
of one argument, say %3x%*. %3x%* is a list of pairs: %3((u%41%* v%41%*) ... (u%4n%* v%4n%*)).%1
Each %3u%4i%1 is the name of a function and each %3v%4i%1 is the λ-expression
definition.  Actually each %3u%4i%1 and %3v%4i%1 is the %2Sexpr translation%* of the
name and the function, but you knew that anyway.  The effect of %3define%*
is to put the appropriate definitions in the symbol table.

For example we might wish to define the function which reverses a list as:
%3
.BEGIN NOFILL;TURN OFF "←";
.GROUP

reverse <= λ[[x] prog[[y;z]
                       y ← x;
                       z ← NIL;
                     a [null[y] → return [z]];
                       z ← cons[car[y];z];
                       y ← cdr[y];
                       go[a];]]
%1
.APART

Then the following would make this definition grist for the %3evalquote%* mill.
.GROUP

%3
DEFINE((
        (REVERSE (LAMBDA (X)
                         (PROG (Y Z)
                           (SETQ Y X)
                           (SETQ Z NIL)
                         A(COND ((NULL Y)(RETURN Z)))
                           (SETQ Z (CONS (CAR Y) Z))
                           (SETQ Y (CDR Y))
                           (GO A) )))
                                       ))

%1
.APART
and subsequently:

%3REVERSE ((A B C))  %1would evaluate to: %3(C B A).%1

.END

%1
C. Weissman's LISP PRIMER is an excellent source of machine examples
and should be consulted now. Always remember that you are writing the sexpr representation
of LISP functions and expressions. Do not confuse the two.
.SS(A project: extensions to %3eval%*,%3eval%*,P67:)
.SELECT 1;

***h.s. about how wonderful LISP is for extenson***

.BEGIN TURN ON "#";
Consider a p%4i%* → e%4i%* component of a conditional expression.  As
conditionals are now defined, e%4i%* must be a single expression to be evaluated.
In subsets of LISP without side-effects this is sufficient. It is, however,
convenient in practice, i.e., in LISP %2with%* side effects, to extend conditionals to include components of the
form: p%4i%*#→#e%4i1%*; ... e%4in%*.
This extended  component is to be evaluated as follows: if p%4i%* is true, then
evaluate the e%4ij%*'s  from left to right, with the value of the component to be
the value of e%4in%*.

For example, this feature, used in %3prog%*s would allow us to replace:

.BEGIN TABIT2(10,14);

\\ ....
\\[p%41%* → %3go[l]%1]
\\ ...
\%3l\%1e%41%*;
\\e%42%*;
\\ ...
\\%3go[m];
.END
with:  [p%41%* → e%41%*;e%42%*; ... %3go[m]%*;]. This is certainly easier to read.
This extended conditional expression is available on versions of LISP 1.6 on the PDP-10.
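
A minimal sketch of the corresponding change to the evaluator, assuming the obvious
representation %3(p%4i%* e%4i1%* ... e%4in%*)%1 for an extended component: %3evcond%*
hands the whole tail of the winning component to an auxiliary function (called
%3evseq%* here; the name is invented) which evaluates the expressions from left to
right and returns the value of the last one.

.BEGIN SELECT 3;GROUP;TABIT2(17,20);TURN OFF "←";

evcond <= λ[[l;a]\\[eval[car[car[l]];a] → evseq[cdr[car[l]];a];
\\ T → evcond[cdr[l];a]] ]

evseq <= λ[[l;a]prog[[m]
\\m ← l;
\b\[null[cdr[m]] → return[eval[car[m];a]]];
\\eval[car[m];a];
\\m ← cdr[m];
\\go[b]] ]

.END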


Here is another cosmetic. Instead of requiring that the body of a λ-definition
be a single expression: %3λ[[#...#]f[#...#]]%*, allow bodies of the form:
%3λ[[#...#]f%41%*[#...#];#...#f%4n%*[#...#]]%1. The evaluation of such 
should mean: bind as usual, evaluate the %3f%4i%1's from left to right, returning
as value %3f%4n%*[#...#]%1.

This next extension to %3eval%*  was derived from the syntax of ⊗→MUDDLE↔←, ⊗→CONNIVER↔←, and
⊗→MICRO-PLANNER↔←.
We have seen that LISP calling sequences are of two varieties: functions calls,
evaluating %2all%* of the arguments; and special forms, evaluating %2none%* of the
arguments.

In an attempt to generalize this regime we might allow the evaluation of some
of the arguments and enlarge on the domain
of objects which can appear in the list of λ-variables.
We might partition the formal parameters into obligatory parameters, optional 
parameters, and an excess collector.
Obligatory parameters %2must%* have corresponding actual parameters; optional
actual parameters are used if present, otherwise declared default 
values are used; and
if there are more actual parameters than the formals encompassed by the first
two classes, then they are associated with the excess collector.

To be more precise consider the following possible BNF equations:

.BEGIN TABIT1(13);TURN OFF "←";

<varlist>\::=[<obligatory> <optional> <excess>]

<obligatory>\::= <par>; ...<par> | %cε%*

<optional>\::= "optional" <opn>; ... <opn> | %cε%*

<excess>\::= "excess" <par> | %cε%*

<par>\::= <variable> | @<variable>

<opn>\::= <par> | <par> ← <form>

.END
The associated semantics follows:

.BEGIN INDENT 0,10;TURN OFF "←";
%21.%*  The formal parameters are to be bound to the actual parameters from
left to right (as usual).

%22.%*  There must be an actual parameter for each obligatory parameter, and
if there is no excess collector there may not be more actual parameters than
formals. (There may be less if we have optionals.)

%23.%*  If a <variable> in a formal parameter is preceded by a "@", then
the corresponding actual parameter is %2not%* evaluated.

%24.%*  We might run out of actual parameters while binding the optionals.
If we do, then we look at the remaining formal optionals.
If a formal is simply a <par> then bind it to %3NIL%*; if a formal is
@<variable> ← <form> then bind the <variable> to the <form>; or if the formal is
<variable> ← <form>, bind <variable> to the value of <form>.

%25.%*  Finally, the excess collector is bound to a list of any remaining
actual parameters in the obvious way: if <par> is <variable> then form a list
of the values of the remainder; if it is @<variable>, bind <variable> to the
actual list. If there is no excess, bind to %3NIL%*.
.END

.BEGIN TURN OFF "←";
These same languages have extended %3prog%*-variables slightly, allowing 
them to be initialized explicitly. If a %3prog%*-variable is atomic,
initialize it to %3NIL%*, as usual. If it is of the form <variable>#←#<form>
then initialize it to the value of the <form>.
.END
.END

.BEGIN TURN OFF "←";turn on "\"; tabs 10,20;
Here are some examples:

1. In the initialization of %3length%* on {yon(P45)}, we could write:
%3...#prog[[l#←#x;#c#←#0]#....%*

2. %3list%* could now be defined as: %3λ[[%1"excess"%* x]x]%*.

3. Consider the following definition:
.BEGIN NOFILL;
.group

\%3baz <= λ[\[x;%1@%*y;%1"optional"%* z; u ← 0; %1"excess"%* v]
\\print[x];
\\print[y];
\\print[z];
\\print[u];
\\print[v]].
.APART
%1Then a call of:
%3eval[(BAZ 2 (CAR (QUOTE (X Y))) 4 5 6 7 (CAR (QUOTE (A . B))));NIL]%* would print:

%3 
2  
(CAR (QUOTE (X Y)))
4
5 
(6 7 A)%*
(and return value: %3(6 7 A)%*).

Similarly defining;

.GROUP
\%3fii <= λ[\[x;y;%1"optional"%* z; u ← 0; %1"excess"%3 v]
\\print[x];
\\print[y];
\\print[z];
\\print[u];
\\print[v]].

%1and calling:
%3eval[(FII 2 (CAR (QUOTE (X Y))));NIL]%* prints:
%3 
2 
X 
NIL 
0 
NIL.%*
.END
.END

←%2Problems%*

Design simple sexpr representations of these proposed constructs.
Make these extensions to %3eval%*.  How useful are these syntactic
sugarings?


.SS(A project: Syntax-directed processes,syntax-directed)
%1
.REQUIRE "NOTES.SDP" SOURCE_FILE;
.SEC(Toward better symbol tables and implementation,symbol tables)
.TURN ON "←","α";
%1
*******diatribe about importance of LISP for implementors****

There are some rather gross defects in our symbol table mechanism.
First, (disregarding %3define%*) we have had to resort to subterfuge
to put the definitions of our functions in our symbol table.
Using %3label%* ({yonss(P38)}), or calling %3eval%* with an initial
symbol table, only defines the function for %2that%* particular call
on %3eval%*. When we finish the computation, the definition disappears.
From a practical point of view definitions should have some permanence.
An implemented version of LISP should know a
lot more functions than the five primitives.  It should have a 
reasonable class of arithmetic functions, and some commonly used
Sexpr functions (like %3length, assoc, subst, equal,%* etc.) all predefined.
If we do things right we should be able to use the mechanism for 
predefining functions as the tool for adding our own definitions.

Second, the table look-up mechanism of %3assoc%*, though theoretically
sound, leaves something to be desired as far as efficiency is
concerned. Since we do wish to implement LISP, we should be concerned with
efficient operations. We will see that an efficient and powerful symbol table structure
can be described quite easily in LISP.

Third, we have already seen that special forms %3QUOTE%* and %3COND%* are
not evaluated in call-by-value style. 
Currently the recognizers for special forms are encoded in %3eval%*, and
when an instance of such a form is seen the argument is passed %2unevaluated%*
to a function to perform the required operations.
It would be nice to extend this flexibility
to the user. We would like to have notation for adding λ-definitions of 
special forms to our symbol table. 
%3eval%* would then  need a way of distinguishing a 
λ-expression defining a function from a  λ-expression defining a special form.
Also there are other calling sequences which are interesting and useful and
would be nice to have. 
The current version of %3eval%* only handles function definitions. Thus
the current symbol table need only store the λ-definition and does not
need explicit information saying it is a %2function%* definition.

Fourth, conceptually we could use a more powerful mechanism.  Consider
%3car[car]%* as an expression to be evaluated. Perhaps it is best just
to say the expression is ill-formed, but if you knew that %3car%* also
was currently being used as a simple variable besides being a function,
then your intuition says the first occurrence of %3car%* is a function 
name and the second occurrence is the simple variable name; and if the 
current value of %3car%* was say, %3(A B)%* then %3car[car]%* should evaluate to %3A%*.
That is, context tells us which occurrence of %3car%* is a function
and which is a simple variable ⊗↓just as an evaluator for %3prog%* can
tell a reference to the label %3x%* from the %3prog%*-variable %3x%* in
%3#...#prog[[x;y]#..#x#f[..]#...#g[x#..#]#...#go[x].%* See {yon(P83)}.←.
The current %3eval%* %2will%* operate correctly on this example since
a recognizer for the function %3car%* is explicitly encoded in %3apply%*.
⊗↓For the same reason, the LISP obscenity %3λ[[lambda] ... ]%* will work.
Notice its sexpr representation is %3(LAMBDA#(LAMBDA)# ...#)%*←.
However, in general our current symbol table mechanism is not sufficient for
this interpretation. Forced to use %3assoc%*,
we would only find the %2first%* binding of a variable. It will either
be a λ-expression (function) or a simple value. Context will determine
how %3eval%* will interpret what is found.
Clearly what is needed is the ability to 
associate more than one property with an atom. In the above example
%3car%* has (at least) two %2properties%* or %2attributes%*.  It has a function
value and it has a simple value. Since %3car%* is a primitive function,
its function value is the location of the machine code to carry out
the %3car%* function; the simple value is the sexpr currently bound to
the variable, %3car%*. 
.TURN OFF "α";

These last three desires, for permanence, generality, and for multiple properties,
can be handled by the same mechanism. Clearly we could continue using an
%3assoc%*-like table, though more complicated, storing information
about what kind of value each dotted pair is. However our desire for
efficiency would certainly not be satisfied. Rather, we will
design a new symbol table which can store with each atom sequences of values and 
their types. This gives us multiple properties.
We will make the table %2global%* 
to %3eval%* and %3apply%*; these functions will now implicitly refer to that table 
and thus need not have an explicit symbol-table argument. This
gives us permanence. 
We will designate  an indicator for function definition and an indicator
for special form definition and expand %3eval%* and %3apply%* to recognize
these indicators.  This gives us generality.

Most implementations of LISP allow this association of 
arbitrary sequences of  %6attribute-value%* pairs with an atom.
It is a very good idea.
How can we associate an arbitrary collection of objects  with
something?   With each atom we will
associate a list called the %6⊗→property-list↔←%* or %6p-list%*.  But atoms
can't be represented as just any other list.  Among other things,
we must be able to recognize the occurrence of atoms so that we can
implement the predicate, %3atom%*.  So atoms are special lists: the
%3car%* (or first element) of the representation of each atom is an 
indicator used exclusively for the beginning of atoms. 
We will use %7α%* to designate the beginning of an atom.
The %3cdr%* of the representation of the atom is the property list. 
The elements of the p-list are associated in pairs. The first element is
the attribute (or property or indicator); the next element is the value.
For example, here
is part of the atom-structure for %3car%*  assuming %3car%* is also being
used as a simple variable and has current value, %3(A B)%*:


.group
.GROUP SKIP 6;
.P50:
←%2Atom-structure for %3CAR%1.
.apart

The indicator, ⊗→%3SUBR%*↔←, tells us that %3car%* is a machine
language function.  The indicator, ⊗→%3VALUE%*↔←, tells us that %3car%* is
also being used as a simple variable.  
The reason for not pointing directly at %3(A B)%* is described
in {yonss(P5)}. 
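
Written out linearly, and with the two pointers in the picture abbreviated as
<machine code> and <cell>, the atom-structure amounts to:

.BEGIN CENTERIT;SELECT 3;
←(%7α%3 SUBR %1<machine code>%3 VALUE %1<cell>%3)
.END

The %7α%* in the first position is the atom indicator; everything after it is the
p-list, with indicators and values alternating.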

It should now be clear how
to program the primitive predicate, %3atom%*: "is %3car%* of the argument
to ⊗→%3atom%*↔← the special atom-indicator?" If it is then %3atom%* gives
value %3T%*, otherwise %3atom%* gives value %3NIL%*.
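
In pseudo-LISP (pseudo because the indicator %7α%* is not something we can write
in an S-expression, and because %3atom%* is of course primitive) the test is just:

.BEGIN CENTERIT;SELECT 3;
←atom <= λ[[x] eq[car[x];%7α%3]]
.END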

.P29:
How about the representation of non-primitive functions?
Non-primitive functions are defined using λ-expressions.  What we do is 
define a new indicator, ⊗→%3EXPR%*↔←, which tells 
the evaluator that the function is a λ-expression.  For example,
here is  part of the atom-structure for %3FACT%*, the Sexpr form
of the factorial function.



.<<atom struct for fact>>
.GROUP
.GROUP SKIP 15;
←%2Atom-structure for %3FACT%1.
.APART

This picture should tell you at least two things: one, that programs
are (almost) stored as binary trees, and two, it should tell you what the function,
⊗→%3define%*↔←, does. %3define%* ({yon(P51)}) puts the λ-definition under the indicator %3EXPR%*
on each of the named atoms (functions).
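
As a sketch of what %3define%* must do (ignoring the fact that it is a special
form, so its argument arrives unevaluated), assume a hypothetical primitive
%3putprop[x;v;ind]%* which hangs the value %3v%* under the indicator %3ind%* on the
p-list of the atom %3x%*; the name %3putprop%* and the value returned below are
inventions for this sketch, and the actual mechanism appears with the
implementation sections:

.BEGIN SELECT 3;GROUP;TABIT2(17,20);TURN OFF "←";

define <= λ[[x]prog[[m]
\\m ← x;
\a\[null[m] → return[NIL]];
\\putprop[car[car[m]];car[cdr[car[m]]];EXPR];
\\m ← cdr[m];
\\go[a]] ]

.END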

The world becomes much simpler if we store each atom uniquely.
This is the reason for saying "almost" above.  That
is, every reference to the atom %3A%* is actually a pointer to the same
"binary tree", the binary tree whose %3car%*-part is the special atom indicator,
and whose %3cdr%*-part is the ⊗→p-list↔← for the atom %3A%*. Thus %3(A . A)%*, which
.GROUP
we have been representing as:

.BEGIN NOFILL;TURN ON "\";TABS 30;

\%7[~~~][~~~]
\%3  A        A
%1
 

is more realistically represented as:

\%7[~~~][~~~]


\%1                to atom-structure for %3A%1.

.END
.APART

.P48:
So the internal structures of this implementation of LISP are not binary trees;
there are intersecting branches. Structures of this sort 
we will call %2⊗→list structure↔←%*. 
LISP thus deals generally with binary list structure.
Trees are a subset of list structures.
We will see in a moment that many of the
computations which we have been performing have also been generating
list structure.

Question: Assume we have the above dotted pair as a value  of a variable %3x%*
and we wish to print the value of %3x%*.
Clearly we would hope to see "%3(A . A)%*" appear on the output  device.
We would expect that the print routine, named %3print%*,  would be given 
a pointer to the dotted pair. %3print%* can recognize that it %2is%* a dotted
pair since %3car%* of it is not %7α%*.
But how can %3print%* distinguish %3(A . A)%* from %3(B . B)%*, for example? 
The pointer in the preceeding diagram will point to %3A%* in one case
and to %3B%* in the other but nothing in our representation tells us %2what%*
to print.  The simplest thing to do  is to store in
each atom-structure some information telling what the name of the atom is.
This is done with another indicator called ⊗→%3PNAME%*↔←, standing for
%2⊗→print-name↔←%*. Thus the atom %3BAZ%* will have at least the following:



.GROUP
.GROUP SKIP 6;
←%2Atom-structure for %3BAZ%1.
.APART

.TURN OFF "#";
The indicator is called %3PNAME%* because the print-name (or ⊗→p-name↔←)
of the atom is what the LISP output routine will print as the name of 
the atom. %2BAZ##%*  means a memory location
containing some encoding of the  letters %2B%*, %2A%*, and %2Z%*.
The symbol, %2#%*, represents some non-printing character; thus we
are assuming a location can contain five characters.
We represent the print-name as a list so that we may allow atoms with
p-names of unbounded length. Thus the p-name for %3BLETCH%* 
would be:
.TURN ON "#";

.GROUP
.GROUP SKIP 6;
←%2P-name structure for %3BLETCH%1.
.APART

How do we get atoms stored uniquely?  ⊗→%3read%*↔← does it. On reading
an atom name (which is the way almost all atoms come into existence),
the %3read%* program checks the current symbol table.  If the atom does
not appear we add a new entry consisting of the p-list with a single
attribute-value pair ---#the#%3PNAME%*,#p-name#pair#---. If the atom %2does%*
appear, we bring back a pointer to the appropriate entry.

Before talking more about %3read, print%* and the other input-output functions in LISP,
we should discuss the storage of list structure. The implementation
described closely follows that for the PDP-10, though most of
the description is machine independent.

First, though Sexprs and now atoms are represented as binary list structure, there will
be quantities which we wish to represent in memory which are not
structures.  For example, numeric constants, and the actual encoding of 
print-names for atoms are not structures.  These quantities will be stored
in an area called %2⊗→full word space↔← (⊗→FWS↔←)%*. Most of the business of 
LISP is massaging of list structure and binary trees, however; such nodes, which have a
%3car%*-part and a %3cdr%*-part, are stored in a separate area called 
%2⊗→free-space↔← (⊗→FS↔←)%*.
The name, free space, will become apparent in a moment.  Each machine
word in FS is divided into two parts; the left half contains a pointer to
the %3car%*-part and the right half contains a pointer to the %3cdr%*-part.
The pointers  may point into FS or FWS.  In more detail, consider this
representation of

.GROUP
←%3(A .(B . C)):%*

.GROUP SKIP 10;
←%2A representation of %3(A .(B . C))%1.
.APART

Obviously the actual addresses are irrelevant. If we replace the addresses with pointers
to the appropriate cells the resulting diagram will contain the same information.
The origin of the LISP "box notation" should be obvious.

We can now describe an implementation of %3eq%*.
Since atoms are stored uniquely, the predicate, ⊗→%3eq%*↔←, 
need only check that its arguments are both atomic and both point
to the same binary tree or location in memory.  In most implementations
the check for atom-ness is suppressed and a simple address comparison
is made. Non-atomic sexpressions are not usually stored uniquely. Thus

←%3eq[(A . B);(A . B)]%* is usually false, but

←%3eq[x;x] %*is usually true for any %3x%*.

Please note that we are %2not%* changing the definition of %3eq%*. 
%3eq%* is still ⊗→undefined↔← for non-atomic arguments.
The preceding remarks deal only with the usual implementation of %3eq%*.

The implementation of %3car%* and %3cdr%* present no problem.
Simple pointer manipulation will suffice. 

.GROUP
.GROUP SKIP 6;
←%2The effect of %3car%1.
.APART

.P28:
Finally, a remark about conditional expressions.  Most versions of LISP
relax the behavior of the predicates, p%4i%*, appearing in conditional expressions
such that instead of testing for the values %3T%* and %3NIL%* we
test only for %3NIL%* and non-%3NIL%*.  That is, anything other than %3NIL%*
satisfies the p%4i%*.  This allows such coding tricks as: 

.BEGIN SELECT 3;TABIT1(30);TURN OFF "←";
\[x ← cdr[x] → ...]

%1which is equivalent to:%*

\x ← cdr[x];
\[not[null[x]] → ...]

.SELECT 1;
(Recall that the value of "←"  is the value of the RHS.)
.END
As with most coding tricks, they should be avoided like the plague.

Again, please note that this discussion is not a change to our definition
of conditional expression, but an implementation dependent feature.
.<<pic of NIL>>
.SKIP TO COLUMN 1;
.SS(A picture of the atom %3NIL%*,%3NIL%*)
We have been writing the atom %3NIL%* as:
.GROUP SKIP 12;
Where the atoms for %3PNAME%* and %3VALUE%* are represented as:

.GROUP SKIP 12;

More realistically we should represent %3NIL%* as:
.NEXT PAGE;
.SS(A first look at %3cons%1)
%1

The effect of the ⊗→%3cons%*↔← function is quite different from that
of the other primitive LISP functions.  The other functions (or
predicates) manipulate existing Sexpressions, whereas %3cons%*
must construct a new sexpression from two existing  sexprs.

That is, given  representations of two sexprs, say %3x%* and %3y, cons[x;y]%*
is to get a new cell, put the representation of %3x%* in the %3car%*-part of 
the cell and the representation of %3y%* in the %3cdr%*-part:

.BEGIN TURN ON "\";NOFILL;TABS 10,30,45;
.GROUP

result of %3cons[x;y]%*


\\%7[~~~~~][~~~~~]

\      %1rep. of %3x\\     %1rep. of %3y%1
.APART

.END
Where is %3cons%* to obtain these new cells?  They come from the ⊗→free
storage area↔←. Within the free storage (FS) area reside all the cells
which have a %3car%*- and a %3cdr%*-part.  The cells which contain the actual
p-names we know reside in the full word space (FWS).  At any point
in a LISP computation there are cells in the FS area which do not contain
any parts of the user's computation.  In particular, when the computation
is begun, only the atom structure for the initial LISP symbol table
uses cells in the FS area. The remaining FS cells form a %2free-space list.%*
Whenever a %3cons%* needs a cell the first cell in the FS list is used and
the FS list is set to the %3cdr%* of the FS list.
As the computation continues, more and more cells are taken from the FS list.
When a %3cons%* operation asks for a cell and the FS list is empty, the
computation is suspended and the %2⊗→storage reclaimer↔←%* or %2⊗→garbage collector↔←%* is
called.
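
In pseudo-LISP the behavior of %3cons%* is roughly the following.  Here %3fslist%*
names the free-space list (a global quantity), %3storecar%* and %3storecdr%* stand
for the machine-level operations which plant pointers in the two halves of a cell,
and %3collect%* stands for the garbage collector; all of these names are invented
for the sketch, since the real %3cons%* is machine code:

.BEGIN SELECT 3;GROUP;TABIT2(17,20);TURN OFF "←";

cons <= λ[[x;y]prog[[z]
\\[null[fslist] → collect[ ]];
\\z ← fslist;
\\fslist ← cdr[fslist];
\\storecar[z;x];
\\storecdr[z;y];
\\return[z]] ]

.END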

.SS(Garbage collection,garbage collection)

During the course of a computation, contents of cells which were taken 
from the FS list often become unnecessary. For example during the evaluation
of something as simple as:

←%3(CONS (QUOTE A)(QUOTE B)),%* many cells are used:

%21.%* At least seven cells are needed just to read in the expression.
If some of the atoms are not present in the symbol table,
more cells will be needed.

%22.%* One cell will be needed to perform the %3cons%* operation.

After the computation is completed, none of the eight mentioned cells are needed.
They are garbage.
Why not simply return these `garbage  cells' explicitly to the  FS list?
Frequently it is very difficult to know just exactly which cells
%6are%* garbage.  Indeed, experiments have been performed in which LISP
programmers were allowed to return `garbage' to the FS list themselves.
The results were disastrous; list structure, thought to be garbage 
was returned to the FS list by the programmers, unaware it was still being
used by other computations. We will see where these difficulties arise
later.

Thus reclamation is passed to the LISP system.  The %3cons%* function and
its FW space counterpart, %3fwcons%*, are the only functions allowed
to manipulate the free storage lists.  When either list becomes empty, the
garbage collector is called.  

%2←The fundamental assumption behind garbage collection is:%*

.NARROW 10,10;
At any point in a LISP computation, all cells which contain
parts of the computation are reachable (by %3car-cdr%* chains)
through a fixed set of known cells or registers.
.WIDEN;

The first phase of the garbage collector, called the %2⊗→marking phase↔←%*,
marks all of the list structure which is currently active (reachable
through the above mentioned fixed cells). This should include all the
atoms in the symbol table and all the cells reachable by %3car-cdr%* chains
from any of these atoms. Also any partial computations which have been
generated must also be marked. Once all the active structure has been
marked, we proceed to the second phase, the %2⊗→sweep phase↔←.%*

Once the marking has been completed, we go linearly (sweep) through memory,
collecting all those cells which have not been marked.  Again,
the assumption is that if cells are not marked in the first phase, 
then they do not contain information necessary to the LISP computation.
These unmarked cells are chained together via their %3cdr%*-parts to form 
a new FS list. The FS pointer is set to the beginning of this list;
similarly, the unmarked cells in FWS comprise the new FW list.
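
Here is a rough sketch of the marking phase in pseudo-LISP; %3marked%* tests a
cell's mark and %3setmark%* sets it (both names are invented), the marking of atom
structure and of full word space is glossed over, and a fuller collector written
in LISP appears later:

.BEGIN SELECT 3;GROUP;TABIT2(17,20);TURN OFF "←";

mark <= λ[[x]prog[[ ]
\\[marked[x] → return[NIL]];
\\setmark[x];
\\[atom[x] → return[NIL]];
\\mark[car[x]];
\\mark[cdr[x]];
\\return[NIL]] ]

.END

%3mark%* is applied to the contents of each of the fixed cells mentioned in the
fundamental assumption; the initial test on the mark is what keeps the walk finite
even when structure is shared.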

A more detailed discussion of this garbage collector 
will be given when we examine the details of implementation in {yonss(P26)}.
And a more complex algorithm is discussed in {yonss(P27)}.

Garbage collection appears to be a very complex and time-consuming
process. Indeed it is.  As indicated above, there are alternatives in some
applications.  If the data-structure manipulations are particularly simple
then leaving storage management in the programmer's hands might
suffice. However, LISP generates very intertwined structures. 

Perhaps there is
another way to allow the system to handle storage management.
First notice that storage management becomes quite simple if there is no sharing
of sublists. The question of when to return a list to the free space list
is easily solved. However it is desirable to share substructure quite often.
Instead of using a garbage collector, we might  associate a counter, called
a %2⊗→reference counter↔←%*, with each
list when it is built. In that counter we will store the number of references
to that list. Thus the counter will be initialized to 1 when the list is created.
Whenever the list
is shared we increase the counter by 1; whenever the list is no longer to
be shared by  some list we decrease the counter by 1.
When the count goes to 0 we can put the cells of the list in the free space list.
One of the most glaring defects in this reference counter scheme is the
prohibition of circular lists. Consider the following sequence:

.BEGIN NOFILL;GROUP;TURN OFF"←";
%21.%* Manufacture a list,%3x%*: %3x ← (B I G M O T H E R L I S T)%1. Reference count is 1.
%22.%* Circularize it: %3x ← circle[x];%*.  Reference count is now 2.
%23.%* Delete all references to %3x%*: %3x ← NIL%*. Reference count reverts to 1,
        but the list is not referred to, is not on the free space list, and has thus
        been lost to the system.
.END
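
For concreteness, the bookkeeping might look as follows in pseudo-LISP.  %3count%*
and %3setcount%* (invented names) read and set the counter associated with a list,
and %3reclaim%* returns its cells to the free space list; the recursive decrementing
of counts for sublists referenced from the reclaimed cells is ignored here:

.BEGIN SELECT 3;GROUP;TABIT1(18);

share <= λ[[x] setcount[x;count[x]+1]]

release <= λ[[x]\[eq[count[x];1] → reclaim[x];
\ T → setcount[x;count[x]-1]] ]

.END

In the sequence above it is %3release%* which is called at step %23%*; it merely
moves the count from %32%* back to %31%*, %3reclaim%* is never reached, and the
circular list is lost.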

←%2Problems%*

I  This problem deals with what is known in LISP as %2⊗→hash consing↔←%*.
We have been storing atoms uniquely, but it should be clear from the behavior
of %3cons%* that non-atomic sexprs are %2not%* stored uniquely.
Certainly storing single copies of any sexpr would save space. For example,
the non-atomic structure of
%3((A . B).(A . B))%* could be represented with two cells rather than three.
Unique storage is not without its difficulties. What problems do you forsee
in implementing such a scheme? 

II We said on {yon(P48)} that many LISP computations generate list structure
rather than true binary trees. Give an example.

.P55:
III Can you write a LISP function %3circle <= λ[[x] ... %* which will take
a list %3x%* and make it circular? Thus:

.BEGIN TABS 8,30,47,65;NOFILL;TURN ON "\";GROUP;
%7

\[~~~~~~~[~~~]\[~~~]~~~]\[~~]~~~]\[~~~~]~~~]
%3\NOTHING\ CAN\ GO\ WRONG
.END
This list is circular on its "last two" elements.
Needless to say printing such structures is not appreciated.


.SS(%3read%* and %3print%*,,P13:)
.TURN ON "←";

When you begin to implement LISP you find that a very significant
part of LISP can be written in LISP.  We have already seen that the
evaluation process is expressible this way.

Here we will see that the majority of the I/O routines can be
written as LISP functions calling a very few primitive routines.
The primitive routines can also be described in `pseudo-LISP'.
That is, we will give a LISP-like description of them, though
they would normally be coded in machine language.

The primitive functions are ⊗→%3ratom%*↔← (read an atom) and ⊗→%3prin1%*↔←.

.BEGIN INDENT 0,10;
.GROUP
%3ratom[ ]%* is a function of no arguments. It reads the next atom
or special character (left paren, right paren or dot).
It looks up that object in the symbol table and returns
a pointer to that symbol table entry.  If no entry
is found an appropriate entry is made.
%3ratom%* is the LISP ⊗→scanner↔←. %3ratom%*  flushes spaces and commas,
using them only for delimiters. It returns only atoms or special characters
to %3read%*, the LISP ⊗→parser↔←.
.APART

.GROUP
%3prin1[x]%* is a function of one argument expecting an atom, left paren,
right paren, blank, or dot as the value of its argument.
It will print the p-name of that object on the output 
device.
.APART
.END

To make life simpler we need to be able to refer to atoms whose values are
the characters "%3)%*", "%3(%*", ".", and " "(blank).  
We will assume that ⊗→%3rpar%*↔←, ⊗→%3lpar%*↔←,
⊗→%3period%*↔←, and ⊗→%3blank%*↔← are such atoms.  For example, if the next input
character is "(" then

←%3eq[ratom[ ];lpar]%* is true (and the input pointer is moved to
the next character!).

%3prin1[period]%* will have the effect of printing a %2"."%* on the output 
device.

We will discuss the structure of %3ratom%* and %3prin1%* in a moment.  In 
the meantime here are %3read%* and %3print%*:
.BEGIN TABIT2(17,20);TURN OFF"←";SELECT 3;

.GROUP
⊗→%3read%*↔← <=λ[[ ]prog[[j]
\j ← ratom[ ];
\[eq[j;lpar] → return[read1[ ]];
\ or[eq[j;rpar];eq[j;period]] → LOSEBIG1;
\ T → return[j]]]]
.APART
.GROUP

⊗→%3read1%*↔← <= λ[[ ]prog[[j;k]
\\j ← ratom[ ];
\\[eq[j;lpar] → return[cons[read1[ ];read1[ ]]];
\\ eq[j;rpar] → return[NIL];
\\ eq[j;period] → go[rd];
\\ T → return[cons[j;read1[ ]]] ];
\rd\k ← ratom[ ];
\\[eq[k;lpar] → k ← read1[ ];
\\ or[eq[k;rpar];eq[k;period]] → LOSEBIG2];
\\j ← ratom[ ];
\\[eq[j;rpar] → return[k];
\\ T → LOSEBIG3]  ]]
.APART
.END

Here's %3print%* and friends. %3terpri%* gets a new output line.
(note: the value of %3print[x]%* is %3x%*.)
.BEGIN TABIT2(17,20);TURN OFF "←";SELECT 3;

.GROUP
⊗→%3print%*↔← <= λ[[x]prog[[ ]
\prin0[x];
\terpri[ ];
\return[x]]]
.APART

.GROUP
⊗→%3prin0%*↔← <= λ[[x]prog[[j]
\\[atom[x] → return[prin1[x]]];
\\prin1[lpar];
\\j ← x;
\a3\prin0[car[j]];
\\[null[cdr[j]] → go[a6]];
\\prin1[blank];
\\[atom[cdr[j]] → go [p1]];
\\j ← cdr[j];
\\go[a3];
\p1\prin1[period];
\\prin1[blank];
\\prin1[cdr[j]];
\a6\prin1[rpar];
\\return[x] ]]
.APART
.END



So far we have thrown all the I/O back on %3ratom%* and %3prin1%*.  Clearly
%3ratom%* will be more interesting. All %3prin1%* need do is get the p-name and
print it.  %3ratom%* needs to search the symbol table (efficiently hopefully), 
and if
the atom is not found, add it to the table.  All %3ratom%* has to work
with is the actual character string (called %3chr-str%* in the future) which
will be the p-name of some atom.  What %3ratom%* could do  is look at the 
p-name of each atom currently in the symbol table; when it finds a match
it returns a pointer to the beginning of that atom. 
(Notice this is essentially the search scheme of %3assoc%*.)
If the appropriate
atom is not found it can build a new one consisting of the p-name, add it
to the table, and return a pointer to this new atom.  This would
have the appropriate effect; each atom would be stored uniquely, but
the efficiency would leave something to be desired.

←%2Problems%*

***** non-recursive reader...better worse***

*** slashifying ***
.SS(Hashing,hashing,P14:)
Symbol table lookup is exactly the problem of looking up 
words in a dictionary.  The above scheme of %3assoc%* (linear search) 
is analogous to a person beginning at page one of the dictionary and 
proceeding linearly (page-by-page and word-by-word) through
the book until he found the word in question.  Truly a losing
scheme. What we normally do is look at the first character of 
the word and go immediately to the subsection of the dictionary which 
has the words beginning with that character.  We know that if
we cannot find the defintion of our word in that subsection we need 
look no further in the book.  Usually we delimit our search even
further by keying on subsequent characters in the word.  Finally
we may resort to linear search to pin down the word on a specific
page or column.
What we want is a similar scheme for the machine.  We might in fact
mimic the dictionary search, subdividing our table into 26 subsections.
We can do better.

Since it is the machine which will subdivide and 
index into the symbol table, we can pick a scheme which is computationally
convenient for the machine. Besides being convenient, the scheme should
result in rather even distribution of atoms in the subsections. 
Obviously if the majority of the atoms end up in the same partition
of the table we will have gained little towards improving the
search efficiency.  Now for terminology.  Each of the partitions 
or subdivisions of the table is called a %2⊗→bucket↔←%*.  Each atom will
appear in at most one bucket.  The computational scheme or function
used to determine which bucket a particular atom belongs in is called
a %2⊗→hashing function↔←%*.  All ⊗→%3ratom%*↔← has to work with is %3chr-str%*, the 
encoding of the actual name of the atom.  So the hashing function
will use %3chr-str%* to determine which bucket to examine.  If the atom
with ⊗→PNAME↔← %3chr-str%* does not appear in that bucket we are assured
that it does not appear anywhere in the table.  In this case we
make up the minimal table entry --#atom#indicator, %3PNAME%*, p-name structure#-- 
and add it to that bucket.

There are lots of hashing functions. Here's one:

%21.%*  Assume that we have N+1 buckets, numbered 0, 1, 2 ... N.

%22.%*  Take the representation of %3chr-str%* (it's a number) and
divide it by N+1.

%23.%*  Look at the remainder.  It's a number between 0 and N.

%24.%*  Use that remainder as the index to the appropriate bucket.

This is a scheme frequently used in LISP.  Given the bucket number,
we then run down the list of atoms in that bucket, comparing
print-names against %3chr-str%*.  If the atom is not found, %3cons%*-up
the atom and stick it in the bucket.
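
For concreteness, that remainder scheme might be rendered in LISP as:

.BEGIN CENTERIT;SELECT 3;GROUP;

←hash <= λ[[chr-str] remainder[numval[chr-str];add1[N]]]
.END

where %3numval%* is a hypothetical function giving the numeric encoding of
%3chr-str%*, %3N%*+1 is the number of buckets, and %3remainder%* and %3add1%* are the
usual arithmetic functions. This is the kind of %3hash%* assumed by the
pseudo-LISP %3ratom%* given below.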

There is a lot of mystery and history about hashing  and other means 
of searching for and storing symbols. References are given in the
bibliography. 

In review, here is the structure of the LISP input mechanism:

%21.%*  %3read%*, the LISP ⊗→parser↔←, builds a list-structure representation
of the input string.  %3read%* is defined in terms of %3ratom%*.

%22.%*  %3ratom%*, the LISP ⊗→scanner↔←, builds atoms and special characters
from the input.  It searches and adds to the LISP symbol table.

%23.%*  The LISP symbol table, usually called ⊗→%3OBLIST%*↔← (for %2⊗→object list↔←%*),
is a list of buckets.  Each bucket is a list of the atoms which `hash' to that bucket.
We actually represent the object list as an array named %3oblist%*.
This way the hash number will give us the array subscript  and we can 
go to the right bucket immediately.
We won't have to go "cdr-ing" down the object list to get to the right bucket.
See figure on next page.

Here is a `pseudo-LISP' version of ⊗→%3ratom%*↔←:

.BEGIN TABIT2(17,20);FILL;

%3hash%*\is a function returning the bucket number of its argument.

%3insert%*\is a function to build the atom and insert it into the bucket.

%3right-one%*\is a predicate used to check if an atom  
has the right print-name.
.NOFILL

.TURN OFF "←";
.SELECT 3;
⊗→ratom↔← <= λ[[ ]prog[[z;i;bucket]
\\i ← hash[chr-str];
\\bucket ← oblist[i];
\a\[null[bucket] → return[insert[chr-str]]];
\\z ← get[car[bucket];PNAME];
\\[right-one[z;chr-str] → return[car[bucket]]];
\\bucket ← cdr[bucket];
\\go[a]]]
.END

%3get%* is a LISP system function. It is described on {yon(P52)}.
.SKIP TO COLUMN 1;
.SS(A review of the structure of  the LISP machine,LISP machine)
.BEGIN TABIT2(10,20);GROUP;


\_______________________________________
\\⊗→%3eval%*↔← and friends
\\⊗→%3read%*↔← and ⊗→%3print%*↔←
\\the ⊗→garbage collector↔←
\\the base registers for marking
\\  ⊗→FS pointer↔←
\\  ⊗→FWS pointer↔←
\\  symbol table(⊗→%3OBLIST%*↔←) pointer
\\  registers for partial results

\_______________________________________
\\⊗→Free space↔←
\\ the free space list
\\ those parts of sexprs containing
\\   %3car%*- and %3cdr%*-parts.

\_______________________________________
\\⊗→Full word space↔←
\\  the full word space list
\\  atom print names
\\  numbers
\_______________________________________
.END
.NEXT PAGE;
.SS(More on symbol tables,symbol tables)

←%2Subtitled: But my Fortran program doesn't do all this crap!%1

It is quite true that a running  Fortran, PL/1, or Algol  program
is far less complicated as far as its symbol accessing mechanisms
are concerned.  In Fortran there is a simple relationship between
variables and memory locations which will contain their values;
in Algol, there is a simple relationship between variables and positions
in the run-time stack.

In LISP, both the quality and the quantity of variables can change.
Arbitrary properties can be associated with atoms at run-time.
Indeed, the symbol table mechanism of LISP is more reminiscent of
that associated with the Fortran or Algol compiler.  For these less
sophisticated languages it is the compiler which performs the mapping
from source language to running machine code.  It is the compiler's 
responsibility to discover the properties associated with each variable.
The compiler can do this because the semantics of the language is such
that at compile time all (or almost all) properties of the variables are known.
This is not true for LISP.  In general you cannot tell until run time what
the attributes of a particular atom are.  The situation is really even worse
than this.  Since programs and data are indistinguishable, we can %3cons%*-up a 
list, assign it to a variable, then turn around and use that variable as 
a function.  Try that in Fortran.

Now the world isn't all this grim.  There are LISP compilers (written
in LISP naturally). They can make many decisions at compile time
about the properties of variables.  But in general the compiled code
will be interspersed with calls on %3eval%*.  That is, %3eval%* will have to make 
some decisions at runtime, because the compiler just doesn't know what to do.
This implies that compiled and interpreted code must be able to 
communicate with each other.  If a piece of compiled code calls a 
λ-expression or conversely, the right thing must happen.  The execution
of the program should be totally transparent as to whether any, or all, or
none of the functions are compiled.  This means that the calling
sequences for both kinds of functions must be compatible. Less
obvious and by far more troublesome, is the communication of global
values between functions.
.SS(Global Variables,global variables)

A variable used globally (or free) in a function (or %3prog%*) is one which
does not appear in the λ-list or in the list of %3prog%*-variables.

For example in:

←%3λ[[x]*[x;y]], y%* is global.

Communication of global variables between interpreted functions
is not problematic:  just look down the atom structure for the %3VALUE%*-cell. 
 Handling of global variables by compiled functions requires some 
care.  The ⊗→%3VALUE%*-cell↔← mechanism is designed to accomplish this.
Essentially, the %3VALUE%*-cell will %2always%* contain the current value
of its variable. The compiler then needs to be able to find this cell and
generate efficient code to manipulate its contents.
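
As a minimal sketch (%3valueof%* is a name invented here; %3get%* is the
property-list search function described with the new evaluator, and we assume the
%3VALUE%*-cell is the pair whose %3cdr%* holds the current value), an interpreted
access to any variable, global or local, amounts to:

.BEGIN CENTERIT;SELECT 3;GROUP;

←valueof <= λ[[x] cdr[get[x;VALUE]]]
.END

No search of an environment is needed; the cell hangs directly off the atom.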

What we will do now is:

%21.%*  Introduce a graphical language, based on ⊗→AMBIT/G↔←, to describe
many of the operations of LISP ({yonss(P26)}).

%22.%*  Describe  parts of the dynamics of evaluation using the ⊗→Contour
Model↔← ({yonss(P31)}).

%23.%*  Describe the mechanism for user-defined special forms, and another
defining mechanism called ⊗→macros↔← ({yonss(P18)}).

%24.%*  Give a new improved definition of %3eval%* and Co. which uses the
new symbol table mechanism
and evaluates forms, special forms and macros.  
Show how the ⊗→bind↔←ing mechanism, called
%2⊗→shallow binding↔←%*, operates ({yonss(P30)}).

%25.%*  Introduce a simple stack machine, %aSM%*, mapping the graphical
descriptions onto a ⊗→PDP-10↔←-like machine ({yonss(P33)}).

%26.%*  Write a LISP function which will operate as a ⊗→compiler↔←, translating
the Sexpr representations of LISP functions into `equivalent' sequences
of %aSM%* code ({yonss(P32)}).

%27.%*  Then we can see how the problem of global variables is handled ({yonss(P5)}).

In parallel to %2this%* development will run a presentation of an 
alternative binding strategy called %2⊗→deep binding↔←%*. 
We will also present an alternative calling structure 
which works well with deep bindings and is not so hardware
oriented as its competitor.
These alternatives are perhaps more intuitive implementations of our original a-list 
version of %3eval%* than is the shallow binding scheme. We will contrast their
relative efficiencies.
.SS(AMBIT/G,AMBIT/G,P26:)
.SELECT 1;
AMBIT/G⊗↓An ambit is half animal, half hobbit. AMBIT/G also is an acronym
for Algebraic Manipulation By Identity Transformation/Graphical.←
is a graphical language for the description of both
data and algorithms.  We will explore  a few aspects of AMBIT,
using it only to  describe most of the basic operation of LISP. AMBIT
is a powerful graphical language suitable for describing complicated
data structures and their manipulation. A crucial problem of non-numerical  
applications is the structuring of the data. Indeed it is frequently the case
that the structuring of the data is the %2only%* problem; once the 
proper representation has been established a very simple program to 
manipulate the complex representation will suffice. Often
what appears to be a complicated algorithm is, in reality, a simple algorithm
operating on complex data. 
A poor representation will lead to an inefficient program. 
Therefore in AMBIT, programming is attempted only after the data design
has been completed.

Since  the 
interrelationships between data structures are inherently more static than
the execution of an algorithm, it is also beneficial to separate complex data
from simple program ⊗↓Perlis: "suppress the constant and display the variable".←.
Proofs of correctness of programs designed in this way should also be more easily
obtained. There are difficulties in designing programs in this manner; a 
significant difficulty lies in the paucity of notation which we have available
for describing algorithms. Current  notations for programming languages are derived
from linear mathematical notations. These notations are ill-suited for
describing data structure representations. 
As a very simple case, functions in mathematics are single-valued, but often 
in programming we would like to return more than one value. In particular we
would like to return values which can be used immediately as arguments to another
function. A generalized composition if you wish.  Mathematical notation
is not conducive to such operations even though the operation is
quite natural and easily understood from the model of execution.

If we do not want to use a high level language then
the other common choice is to encode the
algorithm in machine language. The objections to this approach are clear: the
program is incomprehensible to other people, and all too frequently it is
poorly understood even by its creator.

When we think about writing a complex non-numerical program like a compiler,
a proof-checker, or a piece of a large system, we draw pictures to 
help crystalize our ideas and to structure the data efficiently.
When faced with explaining a complex structure-manipulating
program to someone  we draw a picture. Clearly then, a graphical notation
for programming is worth exploring.  AMBIT is such a language.

First, the data representation in AMBIT is a generalization of the
%2box notation%* of LISP.  We have already seen that it is often
helpful to draw pictures to understand the data representation
in a particular problem.  Instead of requiring that we draw all
nodes in a tree as boxes, we might convey more of our intuition
by drawing nodes of different shapes to represent nodes which contain
different kinds of information.  We have already begun to do
this on a small scale.  Words (nodes) in Full Word  Space
are drawn:###%7[~~~~~~]%*###   whereas nodes in free space are 
drawn:###%7[~~~~]~~~~]%*###  even though as far as most machines are 
concerned both kinds of nodes map into the same kind of memory cell.

AMBIT/G exploits this generalization of shapes.
For example ⊗↓These examples are taken from the formal definition of BASEL.← 
in a typed language containing types integer, boolean, structure, and real 
we represent these modes as:

.BEGIN GROUP;centerit;
.GROUP SKIP 6;
←%2Example of type shapes
.END

.BEGIN TURN OFF "←";GROUP;
We might represent the result of executing %3x ← 2%*, where %3x%* is of type %3int%*, as:
.GROUP SKIP 10;
.END

If the language also has type definition facilities, then we might represent
mode %3complex%* as:

.BEGIN GROUP;CENTERIT;
.GROUP SKIP 10;
←%2Mode of %3complex%*.
.END


.BEGIN GROUP;CENTERIT;
.GROUP SKIP 10;
←%3y %1is%* complex[1.0 ; 1.0]
.END

 This is interesting and useful notation but in 
itself gives us no new power.  The innovation is that we can
also describe our algorithms as graphical transformations of the
data graphs.  The basic statements of the language are %2pattern-match-
and-replacement%* rules.  That is, if a designated pattern can be
found in the current state of the data graph, then replace it with
a new pattern.  The only kind of replacement allowed is the %2swinging%*
of an arrow so that its head moves from one node to another.  
Where the arrow head strikes a node is immaterial; the origin of the tail
%2is%* important.
The tails of the arrows are fixed (as are the number of data nodes
and arrows).  The individual statements are combined into an
algorithm by success and failure conditions.  If a pattern-match-replacement 
is successful then we go to one AMBIT/G statement,
if the match fails then we go to another statement.

The other two components of an AMBIT/G program are declaration
and initialization.  We declare the shapes which will be used in
the algorithm and declare the number and relative positions of the
arrows which can radiate from each type of node.  The initialization
(which we usually suppress) sets the initial pattern of the data graph.
An example is long overdue:


.<<pic of AMBIT/G for car>>
.BEGIN GROUP;SELECT 2;centerit;
.GROUP SKIP 20;




←picture of AMBIT/G algorithm for %3car%*

.END

This algorithm represents the action of the %3car%* routine.  Arrows
which are suppressed are not needed for matching.  The solid links
represent links which must be found if the rule is to succeed and the 
dotted links represent links which are to be set (as solid links) if
the match succeeds.  The exit labeled S is used if all goes as
planned. 
The circle is a notational convenience: it matches any shape.
Thus in this example, %3car%* would succeed as long as AC1 points to a 
cell in free space; otherwise we get an error.

Or here's a more complicated example:

.<< pic of reverse>>
.GROUP SKIP 10;

In this piece of program we have a new element: nodes or (solid) links
whose perimeter is circled are to be tested after the match has been made
and before any specified modifications are executed. If the tests fail
then the F exit is taken from the program.

The general execution of an AMBIT/G block can be described in the usual 
paradigm: selection, testing, and construction.

First the data is selected. A match of the nodes and solid links is attempted.
If this match cannot be made, an error has occurred.

Next, the testing of circled components is done. If a test fails, then
the  failure exit is taken.

Finally the constructions, designated by the dotted links, are performed
and the success exit is taken.

**** add technical here: data graph permanence, functionality, constranits
and general b.s.***

On following pages are AMBIT/G algorithms for ⊗→%3cdr%*↔←, ⊗→%3cons%*↔←, 
⊗→%3eq%*↔←, and ⊗→%3atom%*↔←.
The following conventions pertain to these implementations:

%21.%*  We assume AC1, AC2, ...,ACn  are pointers to the n arguments
of a function (requiring n arguments). 

%22.%*  We assume that every function, on completion of its action, will set
AC1 to point to the value of that function.  

.TURN ON "#";
There are three basic shapes: 
##%7α~~~~]%*,##%7[~~~]~~~]%*##  and ##%7[~~~~~~]%*##  corresponding
to atom headers, free space and full word space objects.  Pointers will be given a
separate shape:###%7A##%*;  and when we describe the garbage collector
we will need a `marked-unmarked' shape: ##%7G%*##  .  One final point:
the `down arrow' on ##%7[~~~]~~~]%*##  and ##%7[~~~~~~~]%*##, is usually implicit
in implementations. What does it represent?


******picture of shapes****
.BEGIN GROUP;CENTERIT;SELECT 2;
.GROUP SKIP 30;
←Picture of Shapes
.END

***review***

Graphical languages are a very natural and powerful means of describing  processes.
As  examples,  the next sections will give two LISP garbage collectors, one
very natural, one more complicated but easily understood; then an AMBIT/G
description of parsing techniques due to T. Cheatham, called ⊗→syntactic dominoes↔←;
and finally an AMBIT/G description of the operators for the AI chestnut, called
the Monkey and Bananas problem.
.SS(garbage collection again,garbage collection)

Notes: The right-side arrows on ##%7α~~~]%*## and ##%7[~~~]~~~]%*##
are used by the garbage collector.  First we sweep from
##%7[~~~~~~]%4TOP%1# to##%7[~~~~~~~~~]%4BOTTOM%1#,##
setting the side arrows to  %7G%4NA%1#;

when we mark, we reset those nodes which are reachable to point to  %7G%4A%1.
Then the final sweep phase will collect all those nodes still marked,
%7G%4NA%1## into a new free space list, pointed to by   FSP .
%7α[~~~~]%4<name>%1## is the atom header;  <name> need not be present.

%7A%4MKR%1## is a pointer used during the garbage collection.

%7A%4P%1## is a pointer used to save the cdr parts of lists during 
the marking phase of the garbage collector.

%7[~~~~]%4TOP%1,## %7[~~~~~~]%4BOTTOM%1##  are used to delimit the boundaries of
free space.

The ⊗→garbage collector↔← has been slightly simplified.  We should mark
more than just  ⊗→OBLIST↔← (the symbol table); for example, what
%7A%4AC1%1 and %7A%4AC2%1 point to should be marked.  There are other 
intermediate results which must be preserved; these will become 
apparent as we proceed.

We must also be careful about the order in which the dotted lines are
`executed'; often we will number the arrows to indicate the order of
execution.

Following this introduction is the main structure of the first LISP garbage 
collector.  The heart of the collector is the marking algorithm.
Some care need be exercised here. Since we are marking binary list
structure in a sequential manner, we need to make a decision as to
whether to mark the ⊗→%3car%*↔←-part first or mark the ⊗→%3cdr%*↔←-part first.
Actually the order in which we do these operations isn't important.
What %2is%* important is that while we are marking the first-chosen
branch, we must remember where the other branch is so that we
can subsequently mark it.  Also since this marking process is 
obviously recursive -- branches may also have branches-- we
must be prepared to remember an arbitrary number of `other branches'.
The pointer, P, is used to help remember these pending branches.

As you chase the arrows in the AMBIT/G marking algorithm, notice
that:

%21.%*  We always mark the %3car%*-branch first. This is usually called
pre-order traversal.

%22.%*  As we prepare to examine the %3car%*-part of a tree we save a pointer
to that node of the tree, using P.

%23.%*  After we have marked the %3car%*-part, we use the information saved by 
P to traverse the %3cdr%*-part.


P is pointing to a sequence of cells linked together by their 
`down arrows'.  When we go down the %3car%*-part, we save the parent-node
in the cell currently being pointed to by P.  Then we set P to its
`down arrow' successor:
.BEGIN GROUP;centerit;
.GROUP SKIP 10;
←%2Push

.END

As long as the %3car%*-part of the structure we are marking points
into free space, we will continue to update P.  If you look back
through the elements which we have added to P you will see a
history of those nodes whose %3car%*-parts have been marked, but whose
%3cdr%*-parts are still to be examined.  When we finally terminate the 
%3car%*-marking we will pick off the last node in P, decrement P, and
then traverse that substructure:

.BEGIN GROUP;centerit;
.GROUP SKIP 10;
←%2Pop

.END
Thus, P and its associated cells are behaving as a stack.

Recall that we are storing information as list structure and thus
have intersecting branches.
Now since we do have intersections, the marking process can be a little 
 clever.  As soon as the marker comes across a previously marked
cell we know that everything below that point in the structure has already
been marked.  We can then terminate the marker on that branch.
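
To pin down the flow of control, here is a rough rendering of the stack-driven
marker as a LISP %3prog%*. It is a sketch only: it uses the predicates
%3NAp%*, %3fwswrdp%*, %3makeA%* and %3conca%* which are introduced with the LISP
collector below, and, purely for exposition, an ordinary list in place of the
chain of cells linked by P.

.BEGIN SELECT 3;TABIT2(18,20);TURN OFF "←";GROUP;

markit <= λ[[l]
\prog[[p]
\ a\[not[NAp[l]] → go[b]];
\\makeA[l];
\\[fwswrdp[l] → go[b]];
\\p ← conca[l;p];
\\l ← car[l];
\\go[a];
\ b\[null[p] → return[T]];
\\l ← cdr[car[p]];
\\p ← cdr[p];
\\go[a]]]

.END
Descending a %3car%*-branch pushes the node onto %3p%*; at label %3b%* we pop the
most recent pending node and take up its %3cdr%*-part, just as described above.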

Here then, is the simple AMBIT/G garbage collector for LISP followed by
a partial description in LISP.

.CENT(A simple LISP garbage collector in AMBIT/G)

.NEXT PAGE
.NEXT PAGE

.CENT(A simple LISP garbage collector written in LISP)
We will now write a garbage collector in LISP to mark and sweep nodes
in FS and FWS.
It will have three main functions:

.BEGIN INDENT 0,10;
%3initialize[x;y]%* to initialize the marking device for each cell in the space from
%3x%* to %3y%*.

%3mark[l]%* to do the actual marking of a list %3l%*.

%3sweep[x;y]%* to collect all inaccessible cells in the space delimited 
by %3x%* and %3y%*.

.END
%3initialize%* will be called twice; once for FS and once for FWS.
%3mark%* will be called for each base register which points to 
active (binary) list structure.
The basic structure of the marker is: if the word is in FWS mark it and leave;
if the word has already been marked simply leave. Otherwise the word is
in FS and thus has a %3car%* and a %3cdr%*; mark the word; mark the %3car%*;
mark the %3cdr%*. The marker is thus recursive; all of the stack manipulation
done by the AMBIT program will be hidden by the recursion.

%3sweep%* will be called twice; once to generate a new free space list and once
to generate a new full word space list. Elements of these free 
lists will be chained together by their %3cdr%* parts.
The initialization and sweep phases of this garbage collector are very similar;
both phases must be assured of reaching every node in the space. We used the
%3down%*-arrows for this.

These main functions use several other functions and predicates:

.BEGIN INDENT 0,10;
%3fwswrdp[x]%* is true just in the case that %3x%* is a word in FWS. This is
used as one of the termination conditions of %3mark%*.

%3makeA[x]%* marks word %3x%* as accessible.

%3makeNA[x]%* marks word %3x%* as not accessible.

%3NAp[x]%* is true if word %3x%* is not accessible.

%3down[x]%* follows the "down" link of the node %3x%*.

%3conca[x;y]%* attaches node %3x%* to the front of the list %3y%*.
The value returned is the new list.
 %3conca%*
is easily described in AMBIT. Can you write %3conca%* as a LISP function?

.END

.BEGIN; CENTERIT;GROUP;SELECT 2;
.GROUP SKIP 7;
←AMBIT diagam for %3conca%*.
.END

.BEGIN SELECT 3;TABIT2(18,20);TURN OFF "←";

initialize <= λ[[x;y]
\prog[[]
\ a\makeNA[x];
\\x ← down[x];
\\[eq[x;y] → return[T]];
\\go [a]]]

.APART;
.GROUP;

mark <=    λ[[l] [\not[NAp[l]] → T;
\fwswrdp[l] → makeA[l];
\T → makeA[l];mark[car[l]];mark[cdr[l]] ]]

.APART
.GROUP

sweep <= λ[[x;y]
\prog[[z]
\ a\[NAp[x] → z← conca[ x;z]];
\\x ← down[x];
\\[eq[x;y] → return [z]];
\\go[a]]]

.END


.CENT(A link-bending garbage collector for LISP)
The description of this very simple collector is easily understood either as
a LISP function or as an AMBIT program. The collector we are about to describe
is more naturally described graphically.  First notice that the AMBIT program
uses an explicit stack to store traversing information whereas in the LISP
description the use of a stack is implicit in the recursion of %3mark%*.

The use of a stack is one of the difficulties associated with this simple 
collector. Garbage collection is invoked when available space has become
exhausted, but here we are, asking for %2more%* space to use for stacking.
The usual solution is to allocate a separate area for stack storage. This 
has its drawbacks. If we don't allocate enough stack space, i.e., if the
depth of a piece of structure becomes too great, then the marker will fail.
Notice too that the amount of stack space can become large, proportional to the
length of a list.

.GROUP SKIP 8;

Another means of dispensing with a stack when traversing a tree  is to
use a more complicated representation for each node. Notice that
if we examined "snapshots" of the execution of the traversal of a tree,
then as long as we traversed the tree each time in the same order, the contents
of the stack would be identical ⊗↓except for perhaps the actual machine locations
used in the stack←. Instead of replicating this portion of a stack on each
traversal we could save the stack information with the nodes in the tree.
This technique is called ⊗→threading↔← ⊗↓See Knuth's Kompendium←.
Threading complicates the representation and causes some difficulties when
we wish to store  list-structure rather than trees. However the idea
of dispensing with the stack %2is%* worthwhile. 
We can strike a compromise. Instead of permanently storing
the threads in the structure we can modify the structure as we traverse it,
use the threads to mark, and then restore the structure to its original
topology.
Certainly the marker will be more complicated, but the complication is
not insurmountable. In fact, it is even reasonably straightforward to prove
the correctness of such an algorithm ⊗↓ W. deRoever, private communication.←.

Finally, here is such a "link-bending"  marker written in AMBIT/G:

.NEXT PAGE;

To reinforce ideas, here is a simple data graph and a sketch of the
modifications to it as the marker operates:

.NEXT PAGE;
.SS(Syntactic dominoes)
The most important parts of our discussions involve data structures, i.e.,
sexprs already in memory.  An important, but peripheral, problem is the input
of such text: translating the linear representation of sexpressions into the
binary list-structure representation. We have mentioned, while discussing %3read%*
and friends, that such a program is called a ⊗→parser↔←. The general techniques
which are used by parsers are easily described using a game called
syntactic dominoes.  This clever device was invented by T. Cheatham and 
is described in AMBIT/G.

.SS(The Contour Model,Contour Model,P31:)
Certain phases of the execution of LISP programs, in particular function
calls and variable bindings are better described in another graphical
language, the Contour Model.

This model, though well known intuitively by any translator writer, was
recently made explicit and embellished by J. Johnston. Its ability
to clearly show the binding mechanisms in block-structured languages
is exceptional.

Note first that this model is of a different type than AMBIT/G. The
Contour Model is essentially a trace of the execution of a specific program, whereas
AMBIT/G is a reasonable candidate for describing the semantics of a 
complete language. Indeed, AMBIT/G has been used to document the extensible
language, BASEL, and LISP.

The simplest way to introduce the Contour Model is by example:
.SKIP TO COLUMN 1;
.NEXT PAGE;
.SS(Macros and special forms,macros,P18:)
Most of the discussion to this point has dealt with getting
a new efficient symbol table which can  fulfill our requirements
of permanence and  multiplicity of properties. Little has been said
about user controlled function calling, which we called the desire for 
generality. Before introducing the definition of the new evaluator
we shall deal with that aspect.

Recall our discussion of ⊗→special forms↔← in {yonss(P8)}.
Special forms have been used for two purposes: to control the evaluation
of arguments (conditional expressions, quoted expressions,
%3and, or%*, etc.), and to create the effect of functions with an indefinite 
number of arguments (%3list, append, plus, ...%*).
Frequently this second application of special forms can be improved upon,
particularly in the presence of a compiler.
Describing such functions as special forms is sufficient,
but not efficient, since the body of the definition must
contain explicit calls on %3eval%*. Even though we wish to define
some functions as if they had an arbitrary number of arguments, when
we %6use%* (i.e. call) the function, the number of arguments to be
applied %6is%* known. In particular, when the compiler is examining the
program, the number of arguments is known. Hopefully the compiler can be
made smart enough to compile better code than calls on %3eval%*.
That hope is well-founded.

Assume, for example, we wish to define %3plus%* as a function with
an indefinite number of arguments:

.BEGIN CENTERIT;SELECT 3;GROUP;
←plus[4;5] = 9
←plus[4;5;6] = 15
←plus[4;add1[2];4] = 11
.END

That is, %3plus%* is to have the properties of a function: its arguments
are to be evaluated (from left to right); but it can take an arbitrary number.
What is needed here seems clear: define %3plus%* in terms of a %2binary%*
addition function, %3*plus%*.

.BEGIN CENTERIT;SELECT 3;GROUP;
←plus[4;5] = *plus[4;5] = 9
←plus[4;5;6] = *plus[4;*plus[5;6]] = 15
←plus[4;add1[2];4] = *plus[4;*plus[add1[2];4]] = 11
.END

That is, we %2expand%* calls on %3plus%* into a composition of calls on
%3*plus%*.
%3plus%* is being used as a %2macro%* and the expansion process in terms
of %3*plus%* is called %2macro expansion%*. Notice that this macro expansion
can be  done by a compiler, generating a sequence of calls on %3*plus%*.
Realize too, that since LISP programs must perform equivalently when
interpreted, we must recognize a macro construction inside %3eval%*.

How are macros written and how do they work?  The body of a macro is a 
λ-expression of %2one%* argument. The %2use%* of a macro looks just like
an ordinary function call, but what is bound to the λ-variable is the whole
call on the macro.  When you look at the evaluation mechanism for
macros in the %aSM%*-%3eval%* you will see that the result of the macro
expansion is passed to %3eval%*. Thus the task of the macro body is to
expand the macro call and return this expansion as its value.
The task of the compiler will be  quite similar; it will expand the macro
but instead of evaluating the expansion it must %2compile%* it.

.BEGIN TURN ON "#";
Let's define %3<%4m%*=%1 to mean "#is#defined#to#be#the#macro#...".
Then a simple macro definition of %3plus%* might be:
.END

.BEGIN SELECT 3;TABIT2(10,25);GROUP;

.P58:
\plus <%4m%*= λ[[l]\[eq[length[l];3] → cons[*PLUS;cdr[l]];
\\ T → list[*PLUS;cadr[l];cons[PLUS;cddr[l]]]] ]
.END

Thus a call %3(PLUS 3 4 5)%* would bind %3l%* to %3(PLUS 3 4 5)%* and
the evaluation of the body would result in %3(*PLUS 3 (PLUS 4 5))%*.
Evaluation of this expression would result in another call on the macro.
This time %3l%* would be bound to %3(PLUS 4 5)%*. Now
%3eq[length[l];3]%* %2is%* true and the value returned is %3(*PLUS 4 5)%*.
This can be evaluated, giving 9, and finally the outermost call on %3*PLUS%*
has all its arguments evaluated, and we get the final answer, 12.

This is a very simple (and inefficient) macro definition. Since the body
of a macro has available all of the evaluation mechanism of LISP, and
since the program structure of LISP is also the data structure, we can
perform arbitrary computations inside the expansion of the macro. Truly
LISP macros have the power to cloud men's minds!!!

Notice that %3SETQ%*  can easily be defined as a macro
over %3SET%*:
.BEGIN CENTERIT;SELECT 3;GROUP;

←setq <%4m%*= λ[[l] list[SET;list[QUOTE;cadr[l]];caddr[l]]].
.END
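
For example, %3(SETQ X (CAR L))%* would expand to %3(SET (QUOTE X) (CAR L))%*;
when the expansion is evaluated, %3(CAR L)%* is evaluated but %3X%* is not, which
is exactly the behavior %3SETQ%* requires.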
.P54:
The effect of "<%4m%*=" should be clear:
it puts the body of the macro definition on the property-list of the macro
name  under the indicator, %3MACRO%*. Likewise "<=" puts the function body
on the p-list under the indicator, %3EXPR%*. Similarly, we will define
"<%4f%*=" to mean "..is defined to be a special form..." and whose effect
is to put the  body under the indicator, %3FEXPR%*.  The body of a fexpr
definition is a λ-expression of (usually) a single λ-variable, say %3l%*. When the
fexpr is called we bind the list of %2unevaluated%* arguments to  %3l%*.
Thus we could define %3plus%* by:

.BEGIN SELECT 3; GROUP;TABIT2(10,12);turn off "←";
.P59:

plus <%4f%*= λ[\[l] prog[[sum]
\\sum ← 0;
\a\[null[l] → return[sum]];
\\sum ← *plus[sum;eval[car[l]]];
\\l ← cdr[l];
\\go[a]]]

.END
Or %3and%* could be defined as:
.BEGIN SELECT 3;CENTERIT;GROUP;TABIT2(20,35);

←and <%4f%*= λ[[l]evand[l]]%1

where %3evand%* is defined as:

%3
\evand <= λ[[l][\null[l] → T;
\\eval[car[l]] → evand[cdr[l]];
\\T → NIL]]
%*

.END
Notice that this definition of %3evand%1 differs from the one on
{yon(P53)}.  The earlier definition required an explicit symbol table;
the newer one uses the global table. This is a mixed blessing. As usual
there are difficulties in getting the correct bindings of variables.
Most applications of %3FEXPR%*s involve explicit calls on %3eval%*
within the body of the %3FEXPR%*.
Consider the following sequence :

.BEGIN TABIT2(20,37);SELECT 3;TURN OFF "←";
\wrong <%4f%*= λ[[x] prog[[y]
\\y ← 2;
\\return[eval[car[x]]]]]

\y ← 0;
\wrong[y];
.END;

Execution of the above sequence shows the value of %3wrong[y]%* to be %32%*,
rather than the expected %30%*. Clearly, the problem is that the call on
%3eval%* takes place in the wrong environment. To alleviate this situation
%3FEXPR%*s may be defined with %2two%* arguments. In this case a %2call%*
on the %3FEXPR%* will bind the environment %2at the point of call%* to that
second argument. %3eval%* and %3apply%* are allowed to be called with either
one or two arguments. If a second argument is present, it is used as the
environment during evaluation.
Thus:

.BEGIN TABIT2(20,39);SELECT 3;TURN OFF "←";
\right <%4f%*= λ[[x;a] prog[[y]
\\y ← 2;
\\return[eval[car[x];a]]]]

\y ← 0;
\right[y];
.END;

The call on %3right%* will use the environment with %3y%* being %30%*
rather than %32%*.


Macros can be used to perform the operations which special forms
do, but with perhaps more efficiency. 
The idea of macro processing is not recent.  Some of the earliest assemblers
had extensive macro facilities.  Lately macros have been used as a means
of extending so-called high level languages.

One of the simplest kinds of macros is %2textual substitution%*.
That is, when a use of a macro is detected we simply replace the call by its
body.  A slightly more sophisticated application is the %2syntax macro%*.
Every time we come across an application of a syntax macro the expander 
processes it as if it had never been seen before even though much
of the expansion is repetitious. That is, syntax macros have no memory.

%2Computational macros%* are an attempt to reduce some of this repetition.
In this scheme a certain amount of processing is done at the time the macro
is %2defined%*. For example, a computational subtree reflecting the body of the
macro might be formed.  Then whenever the macro is %2used%* we can
simply make a copy of this subtree and "glue" this subtree into the parse-tree
which we are building.  This computational subtree is commonly formed by passing
the body of the macro through the compiler in a 
"funny" way.  The main problem with the computational macro is that there are
many desirable macros which have no such subtree, or there is other information
 necessary to process the macro.  There are solutions to this problem,
one of which closely parallels the abstract syntax ideas of McCarthy.
All of these styles of macros are subsets of the LISP macros.

Many versions of LISP also have a very simple syntax macro (called ⊗→%3read%* macros↔←)
independent of
the <%4m%*= - facility. Constants appear quite frequently in LISP expressions;
 and so we have already introduced abbreviations for the sexpression
representation of numbers, %3T%* and %3NIL%*.  That is we don't %3QUOTE%* them.
%3read%* macros are used to abbreviate other sexpressions. The most
common %3read%* macro is used to abbreviate the construct:

.TURN ON "←";
←%3(QUOTE %*<sexpression>%3)%*.

A special character is chosen to designate the macro, say "@". Then
whenever %3read%* sees "@" the next sexpression, s, is read in and
the value: %3(QUOTE %1s%*)%1 is returned by %3read%*. Thus:

←@<sexpression> is an abbreviation for %3(QUOTE <sexpression>).%1

In some systems the user may define his own %3read%* macros, in others
they are pre-defined.
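
The dispatch might be sketched as follows, assuming (hypothetically) that the
scanner returns a distinguished atom, %3AT%*, when it sees the character "@":

.BEGIN CENTERIT;SELECT 3;GROUP;

←readmacro <= λ[[z][eq[z;AT] → list[QUOTE;read[]]; T → z]]
.END

where %3readmacro%* would be applied to each token as %3read%* assembles an
expression; both the name and the hook into %3read%* are inventions for
illustration only.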

.SS(Problems)

I.  Define %3list%* and %3append%* as macros. You may use only
the LISP primitives (functions, predicates, conditionals and
recursion) and a binary function, %3*append%*.

II. Give a macro definition of an extended %3SETQ%*, which is called 
as %3(SETQ#%1var%41%*#exp%41%*#...#var%4n%*#exp%4n%3)%1.
Each var%4i%* is a name; each exp%4i%* is an expression to be evaluated
and assigned to var%4i%1. The assignments should go from "left-to-right".

Thus %3(SETQ X 2 Y (TIMES 2 X) X 3)%* when executed should assign
%33%* to %3X%* and %34%* to %3Y%*.

III. Define %3list%* as a fexpr.

.SS(%aSM%*-%3eval%*,%3eval%*,P30:)
Finally, here is the new evaluator.

(the `line numbers' are not part of the definitions)
.BEGIN VERBATIM;SELECT 3;
eval <= 
1. λ[[x]prog[[y]
2.	return[[numberp[x] → x;
3.		atom[x] → [y ← get[x;VALUE] → cdr[y];
  			   T → err[UNBOUND VARIABLE]];
4.		atom[car[x]] → [y ← getl[car[x];(EXPR FEXPR MACRO)]
5.				  → [eq[car[y];EXPR] → apply[cadr[y];mapcar[function[eval];cdr[x]]];
6.				     eq[car[y];FEXPR] → apply[cadr[y];list[cdr[x]]];
7.				     T → eval[apply[cadr[y];list[x]]]];
8.				y ← get[car[x];VALUE] → eval[cons[cdr[y];cdr[x]]];
9.				T → err[UNDEFINED FUNCTION]];
10.		T → apply[car[x];mapcar[function[eval];cdr[x]]]]]]]


apply <=
 λ[[fn;args]
11.	[atom[fn] → [get[fn;EXPR] → apply[get[fn;EXPR];args];
12.		     T → apply[eval[fn];args]];
13.	 eq[car[fn];LAMBDA] → prog[[z]
14.				bind[cadr[fn];args];
15.				z ← eval[caddr[fn]];
16.				unbind[];
17.				return[z]];
18.	 T → apply[eval[fn];args] ]]

.END;

First let's describe the new LISP system functions which are used here.
They are:

.BEGIN INDENT 0,10;
.p52:
%3get[x;y]: %*
%3x%* is an atom; %3y%* is an indicator.  ⊗→%3get%*↔← will return the value-part
of the attribute-value pair
associated with %3y%* under the atom %3x%*.  If %3x%* does not have the
indicator %3y%*, then %3NIL%* is returned.
.END

.BEGIN INDENT 0,10;
%3getl[x;y]:%*
%3x%* is an atom; %3y%* is a list of indicators (see line 4). ⊗→%3getl%*↔←
will search the property list of the atom %3x%* for the first occurrence
of an indicator which appears in the list %3y%*.  If such a match 
is found, then the remainder of the p-list, beginning with the
matching indicator, is returned.  If no match is found, %3NIL%* is
returned.
.END

.BEGIN CENTERIT;GROUP;
An example:
.GROUP SKIP 6;
←%3getl[FOO;(BAZ PNAME)]%* has value:      %3get[FOO;PNAME]%* has value:
.END
The virtue of %3getl%* is that it can distinguish between an atom which has 
an indicator with value %3NIL%* and an atom which does not have the indicator.
%3get%* cannot communicate this distinction.

First, on lines 5 and 10 we use the LISP mapping function, %3mapcar%*, to evaluate the argument
list during function calls. See {yon(P34)}.

Notice line %33.%* is an application of an obscene LISP coding trick
described on {yon(P28)}. Notice too, that line %33%* uses %3get%*. You
should understand why %3get%* works and %3getl%* is not required.

There are enough important new details in this %3eval-apply%* family
to warrant spending a bit of time. Most of the changes involve symbol table
operations.  %3eval%* (and %3apply%*) no longer carry an explicit symbol table.
The property-list of the atom contains the information now.
In this new scheme of things, %3get%* and %3getl%* search the symbol table
(p-list) for the appropriate bindings. 

%3eval%* first checks if it has been given a number; any other atomic
expression given to %3eval%* is expected to be a  variable and to
have a ⊗→%3VALUE%*-cell↔←.

If the %3car%* of the expression given to %3eval%* is atomic, then the
p-list of that atom is examined for one of three indicators.
We have already seen ⊗→%3EXPR%*↔← on {yon(P29)}; it is the indicator designating
a function represented as a λ-expression.
Evaluate the arguments and call %3apply%*. The indicator ⊗→%3FEXPR%*↔←,
designates a special
form. Pass the %2un%*evaluated argument list to %3apply%*.
The indicator, %3MACRO%*, is recognized on line %37%*.


There are actually two other indicators recognized by %3eval%*. ⊗→%3SUBR%*↔← is
the machine-language version of an %3EXPR%*; and ⊗→%3FSUBR%*↔← is the machine
version of a %3FEXPR%*.
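
A short trace may help tie these cases together. Assume %3plus%* has been given
the %3FEXPR%* definition on {yon(P59)} and consider %3eval[(PLUS 2 3)]%*.
The form is neither a number nor atomic; %3car%* of the form, %3PLUS%*, is atomic,
and %3getl%* finds the indicator %3FEXPR%* on its p-list. Line %36%* therefore
hands %3apply%* the λ-expression together with the argument list %3((2 3))%*;
%3apply%* binds %3l%* to the %2unevaluated%* list %3(2 3)%*, and the %3prog%* body
sums %3eval[2]%* and %3eval[3]%* with %3*plus%*, returning %35%*.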

.SS(Binding revisited,bind,P78:)

The most interesting changes in the evaluator involve binding
and unbinding of variables.

We have seen that the old symbol table mechanism has the correct
effect for the proper evaluation of recursive functions.  As 
we bound variables to values on entry to a λ-expression, we
%3cons%*ed the new bindings onto the front of the symbol table.  This
had two results:

%21.%*  In the evaluation of the body of the λ-expression we got the
correct bindings of the variables.

%22.%*  The old value was saved so that we could retrieve it on leaving
the λ-body.

Now with the new table organization we need to develop the same
facility.  It is not sufficient for the binding of λ-variables to
simply change the contents of the ⊗→%3VALUE%*-cell↔←. This would violate
condition %22%*. Besides changing the value, we must save the prior
value.

The crucial phrases are lines 14 and 16 in the new version of %3apply%*.
⊗→%3bind%*↔← is a function, taking two arguments.  The first is the list of
λ-variables; the second is the list of new values.  %3bind%* saves the 
old values and rebinds the λ-variables to the new values.
⊗→%3unbind%*↔← is a function of no arguments used to restore a list
of λ-variables to their previous values.  How is 
this wondrous feat performed?

We have a ⊗→stack↔←, or ⊗→pushdown-list↔←, called SP 
(for %2S%*pecial %2P%*ushdown) in which we can 
save old values of variables.  %3bind%* will add values to SP; %3unbind%*
restores the saved values.  Actually because of the stack-like
behavior of the binding and unbinding processes we can increase
the efficiency of %3bind%* and %3unbind%*.  %3bind%* saves both the value
and the location of the %3VALUE%*-cell for each variable being rebound.
It also puts special marks in the SP stack delimiting each block
of λ-rebindings.  All %3unbind%* need do is restore the first block
of saved values.  It knows the values and also knows the addresses of
the %3VALUE%*-cells.  
All of the information needed for restoration is saved in the stack.

.BEGIN GROUP;TABIT3(30,45,60);

Given %1λ[[x%41%*; ...; x%4n%*]%9x%1][a%41%*; ... ;a%4n%*], here's an example of binding:


\        to value\\old value of x%41%*
\       cell for x%41%*

\save 
\and\  ...
\rebind
\  ===>

\        to value\\old value of x%4n%*
\       cell for x%4n%*





\\             |
\\             ↓
\\rebind x%4i%* to a%4i%*, changing the
\\  contents of the value cell
\\      (see %3*bind%*)

\\             |
\\             ↓
\\        evaluate %9x%*

\\             |
\\             ↓
\\restore old values using information
\\    stored in the stack, SP.
\\       (see %3*unbind%*)
.END

.BEGIN CENTERIT;
.GROUP SKIP 15;
←%2The primitives, %3*bind%* and %3*unbind%*.
.END
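
As a sketch only (the real work is done by the machine-level primitives pictured
above), the effect of %3bind%* can be written as a LISP %3prog%*.  Here
%3pushmark%* and %3save%* are hypothetical operations on the stack SP, and
%3rplacd%*, which destructively replaces a %3cdr%*-part, plants the new value in
the %3VALUE%*-cell:

.BEGIN SELECT 3;TABIT2(18,20);TURN OFF "←";GROUP;

bind <= λ[[vars;vals]
\prog[[cell]
\\pushmark[SP];
\ a\[null[vars] → return[T]];
\\cell ← get[car[vars];VALUE];
\\save[cell;cdr[cell];SP];
\\rplacd[cell;car[vals]];
\\vars ← cdr[vars];
\\vals ← cdr[vals];
\\go[a]]]

.END
%3unbind%* need only pop SP back to the most recent mark, restoring each saved
value into its saved %3VALUE%*-cell as it goes.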

This binding scheme, called %2⊗→shallow binding↔←%*, works quite well for 
simple λ-binding and lookup. As with the previous a-list symbol table
we need to exercise some care when handling functional arguments. How can we
implement the equivalent of the %3FUNARG%* hack in this new binding and symbol table
organization?  Recall how the old %3eval%* coped ({yon(P79)}).
When we recognized  an instance of a functional argument we saved a pointer to 
the current environment. When we %2applied%*
the functional argument we restored the symbol table in such a way
that global variables were accessed in the saved environment while
local variables were found in the current environment.
We must do the same. When %3function%* is seen save the current SP pointer;
when we apply the functional argument, 
.BEGIN centerit;

%3←(FUNARG %*<function> <old SP>%*) %1to     arg%41%*; ... arg%4n%3 ,

.END
we will first rebind all of the variables found by the old SP pointer to
those old values, while saving the current values in SP. Then we  can rebind the
λ-variables (i.e., local variables) of the <function> to arg%41%* through
arg%4n%*. The environment will then be properly restored for evaluation of the
function body. When the evaluation  has been completed we restore using %3unbind%*.

Compare the shallow binding strategy to that of the a-list. The a-list was always
explicit when evaluation was occurring; saving environments only required saving
a pointer to the current symbol table; changing environments at most required
calling %3eval%* with a different symbol table, and when leaving a context
the old table was restored implicitly by the recursion mechanism. Accessing
a variable involved an  appreciable computation; we had to search the table
for the variable-value pair.

Shallow binding obviates the symbol table search while incurring added 
expense in binding and unbinding.
The care and feeding of functional arguments is more difficult since there
is only one symbol table, not the tree of tables implicit in the 
previous incarnation. True, the information necessary to recover an environment
is present in SP, but it is expensive to retrieve it.

Shallow binding: 0; a-list: 1; (Christians: 0; lions: 1).

.<<PROBS OF BINDING AND FEXPRS>>
.BEGIN TURN ON "←";GROUP;
←%2Problems%*

I  Evaluate the following using the new %3eval%*:

1. %3eval[((LAMBDA(L)(PRINT L))(QUOTE A))]%* assuming %3print%* is the usual printing function.

2. %3eval[(FOO (QUOTE A))]%* where %3foo <= λ[[l]print[l]]%*.

3. %3eval[((CAR (QUOTE (FOO)))(QUOTE A))]%*, %3foo%* as above.

4. %3eval[(FOO1 (QUOTE A))]%* where %3foo1 <%4f%*= λ[[l]print[l]]%1.

and finally:

5. %3eval[((CAR (QUOTE (FOO1)))(QUOTE A))]%*.

Explain what happened. Can you "fix" it?

II Some implementations of LISP  distinguish between %3EXPR%*s and %3FEXPR%*s
by modifying the prefix, %3LAMBDA%*. %3FEXPR%*s are stored with prefix %3NLAMBDA%*;
%3EXPR%*s are stored with the normal prefix %3LAMBDA%*. What are the
advantages and/or disadvantages of this scheme.

III Recall the  extensions to LISP binding proposed in {yonss(P67)}.
Discuss their implementation in light of the new %3eval%* and symbol table.

.END
.SS(Review)
As things presently stand we have a  basic LISP machine consisting
of an encoding of %3eval%* and friends, the I/O routines, the garbage
collector, an initial symbol table which believes in the primitive 
functions and  some commonly needed utility functions.  We have
two areas of memory: free space, and full word space.

Expressions (forms)  to be evaluated are read in, converted to list structure,
and evaluated by %3eval%*.  What %3eval%* is doing is traversing the 
sexpression representation of the form, interpreting the 
information found there as LISP "instructions".  This is an expensive process.
In the general case there is nothing we can do to reduce this cost, but very
frequently there is a mapping from the LISP expressions to be 
evaluated to a sequence of actual machine instructions, which when
carried out will have the same computational effect as %3eval%*, massaging
the list structure.  This mapping is called a ⊗→compiler↔←.  %3eval%*
and friends are called an %3interpreter%*.  

What we wish to do now is describe a compiler for a subset of LISP.
This has many motivations:

%21.%*  To describe the compiler we must carefully define the machine
which will execute the instructions which the compiler produces.
Examination of a machine will reinforce the ideas of stacks, make
more concrete the structure of recursion, of garbage collection and
representation of sexprs and list structure.

%22.%*  It will  show a very non-trivial application of non-numerical
computation. At the same time we will see how simple it is to
describe such complex algorithms in LISP.

%23.%*  It will clearly show the relationship between compilation and
evaluation.  That is, the LISP function representing the compiler
will very closely parallel the structure of the evaluator, %3eval%*.
If you understand %3eval%*, then the compiler is easy.

%24.%*  Everyone should see a compiler ("%2Shut up!%*", he explained).



.SS(%aSM%*:A Simple machine,%aSM%*,P33:)
This section  describes a simple machine which
has a sufficient instruction set to handle an implementation of LISP.
Before we begin, note that this machine is %2not%* necessary for our
understanding of %3eval%*. %3eval%* is self-contained. We need only describe
a machine to discuss implementation and compilation.  This,
indeed, is an objection to describing the meaning of programming languages
in terms of a compiler -- you must understand %2two%* things now -- the
language %2and%* a machine.

The simple machine, %aSM%*, has a slight similarity to the organization of the
PDP-10. We need very few features to adequately discuss the interesting 
facets of implementation
of our LISP subset. Certainly, if we were to undertake a real implementation, many
more instructions would be necessary. Similarly, when we discuss compilation
our %aSM%* suffices, but if we wished to perform %2efficient%* compilation
we would hopefully have a better instruction set.  The point here
is to understand basic algorithms. If that is accomplished it is quite
straightforward to examine problems of efficiency, and details of implementation.

%aSM%* has a conventional addressable main memory, and n special registers,
AC1, AC2, ..., ACn. These registers are called %2accumulators%*
and as with the AMBIT/G description, these n registers are to contain pointers
to the arguments of a function at the time of invocation.
Each memory location or register is assumed to be large enough to contain
two addresses. For sake of discussion, assume the word size is 36 bits.
Assume the addressing space is then 2%818%*. 
The mapping of a %7[~~~]~~~]%* onto a %aSM%* location is easy; the %3car%*
maps to the left-half of the word; the %3cdr%*, to the right.
A memory area to contain full-words (%7 [~~~~~~]%* ) is designated. 
All p-names and numbers are stored here.

Parts of %aSM%* memory can be designated as stacks. Each stack is a contiguous
area of memory, and the current top of the stack is referenced  by a general
register, P1, ..., Pn, called a %2stack-%* or %2pushdown-%* pointer.
The stacks will be used to contain the partial results of calculations and
will contain the information necessary to implement recursive function calling.
The primary LISP stack is named P.

There are only three main classes of instructions necessary to describe
LISP implementation: instructions for constant generation, instructions for
stack manipulation, and instructions for control of flow.

The control instructions and some of the stack instructions refer to the 
program counter of %aSM%*. %aPC%* designates this counter. %3C%* in the following
means "contents of...".

.BEGIN TABIT1(19);TURN OFF "←";

MOVEI ACi const\%3C%1(ACi) ← const

PUSH P ACj\%3C%*(%3C%*(P)) ← %3C%*(ACj)
\%3C%*(P) ← %3C%*(P)+1. Push contents of ACj onto top of stack.

POP P ACj\%3C%*(P) ← %3C%*(P)-1
\%3C%*(ACj) ← %3C%1(%3C%*(P)). Pop top of stack into ACj.

POPJ P\%3C%*(P) ← %3C%*(P)-1
\%3C%*(%aPC%*) ← %3C%*(%3C%*(P)). Pop top of stack into %aPC%*; used as return.

.BEGIN INDENT 0,19;FILL;
CALL i fn\This instruction handles function calling. i is the number of arguments
(assumed to be in AC1 through ACi at time of call). fn represents a function name.
One effect of CALL is to leave return information on the stack, P.
.END

MOVE ACi -j P\%3C%*(ACi) ← copy of the j%8th%* element (counting from zero%8th%*) of stack P.

MOVEM ACi -j P\j%8th%* element of stack P is replaced by contents of ACi.

.BEGIN INDENT 0,19;FILL;
SUB x y\%3C%*(x) ← %3C%*(x) - %3C%*(y); Used in the context, SUB P const, to remove
a block of cells from the top of the stack.
.END

JRST j\%3C%*(%aPC%*) ← j. Go to loc j.

JUMPE ACi j\%2if%* %3C%*(ACi)=0 %2then%* %3C%*(%aPC%*) ← j;

JUMPN ACi j\%2if%* %3C%*(ACi)%c≠%*0 %2then%* %3C%*(%aPC%*) ← j;

.END
For much of the sequel, the only calling sequence of interest will be
ordinary function calls.  The evaluated arguments are to be loaded into
the accumulators.
Internal λ-expressions are not handled yet. (See problem IV on {yon(P36)}).
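
As an illustration of this calling sequence, and a deliberately naive preview of
the sort of code a compiler might emit, consider %3g <= λ[[x;y]f[y;x]]%*.
Assuming the arguments of %3g%* arrive in AC1 and AC2, that %3-0%* addresses the
most recently pushed element of P, and that every function returns its value in
AC1, the code for %3g%* might read:

.BEGIN TABIT2(12,32);GROUP;

\PUSH P AC1\save %3x%*
\PUSH P AC2\save %3y%*
\MOVE AC1 -0 P\pick up %3y%* as the first argument of %3f%*
\MOVE AC2 -1 P\pick up %3x%* as the second argument
\CALL 2 F\call %3f%*; its value is left in AC1
\SUB P 2\remove the two saved values from P
\POPJ P\return; the value of %3g%* is already in AC1
.END

The pushes are not strictly necessary here, but they show the stack discipline
the compiler will lean on when arguments must be computed before being loaded.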
More instructions for %aSM%* will be given on {yon(P35)} when we
discuss efficiency.

.SS(An alternative: deep bindings,deep binding)
There are some weaknesses in the model of implementation espoused
in the previous sections.  First, as we have seen  ⊗→shallow binding↔← requires
that we be a bit careful in handling functional arguments. 
Also there is an appreciable overhead in changing contexts.

Second, our strategy of requiring argument passing in a fixed number of
registers, AC%41%* through AC%4n%*, is unnecessarily restrictive. There
will be some arbitrary limit on the number of arguments which a function
may have. After a moment's reflection on the representation of the symbol table
as a stack {yon(P43)}, it should be apparent that we might try passing the
arguments to a function requiring %3n%* arguments as the first %3n%* elements
of a stack. The value returned by the function could be defined to be the
value located in the top of the stack. This holds promise. Consider the
evaluation of a form 
.BEGIN CENTERIT;SELECT 3;
←f[g%41%*[ ... ]; ... g%4n%*[ ... ]]  .
.END
The evaluation of the arguments is to proceed  from left to right. If we perform
the procedure "calculate the argument; leave value in top of stack", then %3f%*
could expect to find values for its λ-variables  as:

.BEGIN TABIT3(20,40,60);GROUP;

\value stack:\|  value for %3x%4n%1\|
\\|  value for %3x%4n-1%1\|
\\|       ...   ... \|
\\|  value for %3x%41%1\|
.END

There will be no problem about searching to find the values; %3f%* "knows" 
where to find the values in the stack. When %3f%* has finished its computation
it need only remove the top n elements from the value stack and add the value
which it computed.
This scheme seems to have the advantage of shallow binding: fast access
to values, but none of the disadvantages. It looks like the a-list scheme
for binding and unbinding. What's the problem? It's global variables.

If %3f%* wants to access a global variable how can it be found?  All
we have stored on the stack is the %2value%* and there is no way that %3f%*
can "know" %2which%* value is the value of the global variable.
Surely we can solve this problem: simply store %2pairs%*##---##name-value pairs.
So what's the problem now? The expensive calculation only arises when we
access global variables. Most variables %2are%* local aren't they ---
or are they? Function names are variables and are almost always global;
%3car%* is a variable name, whose "value" is a primitive routine to
compute the %3car%* function. Seldom do you wish to redefine %3car%*.

Searching the stack for name-value pairs %2is%* more expensive than 
picking up the value cell of the shallow binding scheme. We will soon see
that LISP compilers for deep binding %2can%* be made efficient
in their access of local and global variables; but  the interpreter
%2will%* have to search. One of the worst cases will be the execution of a loop,
in which accesses to the same variables occur frequently.
This, perhaps is another good reason for removing control of iteration
from the hands of the programmer.  The extent of (or even the presence of) a loop
which the user is 
controlling by tests and goto's is difficult to discover.  If a loop
is controlled by language constructs (WHILE, REPEAT,  etc.) then the 
interpreter might have some chance of improving the execution of the loop.

As we have previously said, deep binding is very reminiscent of the a-list
symbol table structure which was first a stack, then embellished to become
a tree when functional arguments appeared. But recall that the a-list scheme
had each name-value pair chained to its successor. Pointers into the table
structure (environment pointers) entered into this chain at various places
to define an environment. With the a-list scheme, we were able to generate a
tree structure, but deep binding, so far, still has a stack-like structure.
Something must be done. Let's look at the a-list %3eval%* again.

First note that the environment pointers we generated for handling
functional arguments point into the symbol tables in very specific places; 
they %2always%* point into the tables at the beginning of λ-bindings. They will
%2never%* point into  the interior of a block of bindings. Thus each
%2block%* of λ-bindings can go onto a stack in a contiguous fashion.
Now recall the Weizenbaum environments: each environment has a local
symbol table (λ-variables, and %3prog%*-variables); each environment
also has entries for E%4a%* and E%4c%*. So we may simulate a tree on the 
name-value stack if we include in each block of bindings, pointers
representing E%4c%* and E%4a%*. Thus the search strategy becomes yet a bit
more complex: when seeking a value for a variable run down the name-value
stack comparing names; when the block is exhausted we follow the E%4a%*-pointer.
To exit a block is easy and in fact much easier than the shallow binding
ritual: simply restore the E%4c%*-pointer.
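
A sketch of that search, treating each block of bindings as a structure containing
its list of name-value pairs and its E%4a%* pointer (the selectors %3bindings%*,
%3acc%*, %3name%* and %3val%* are invented for the sketch):

.BEGIN SELECT 3;GROUP;TABIT2(20,35);

\lookup <= λ[[x;blk][\null[blk] → err[UNBOUND VARIABLE];
\\T → search[x;bindings[blk];blk]]]

\search <= λ[[x;prs;blk][\null[prs] → lookup[x;acc[blk]];
\\eq[x;name[car[prs]]] → val[car[prs]];
\\T → search[x;cdr[prs];blk]]]
.END

Exactly this chase through E%4a%* pointers is what shallow binding avoids, at the
price of the %3bind%*-%3unbind%* ritual.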

***more,more***
.END "JRA";